Target-Specific Machine Learning Scoring Function Improved Structure-Based Virtual Screening Performance for SARS-CoV-2 Drugs Development.

Author information

State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China.

National Key Laboratory of Crop Genetic Improvement, Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China.

Publication information

Int J Mol Sci. 2022 Sep 20;23(19):11003. doi: 10.3390/ijms231911003.

Abstract

Machine learning has been shown to improve the accuracy of structure-based virtual screening, and the large amount of publicly available experimental data further improves the performance of machine learning approaches. Structure-based virtual screening relies heavily on scoring functions, and it is widely accepted that target-specific scoring functions may perform more effectively than universal scoring functions in real-world drug research and development. A method that can efficiently build target-specific scoring functions would therefore benefit drug discovery. In this proof-of-concept study, the 3CL enzyme of SARS-CoV-2 was used as the target. Experimental data were retrieved from the BindingDB database. Smina was used to generate protein-ligand complexes, from which InteractionFingerprint (IFP) and SimpleInteractionFingerprint (SIFP) fingerprints were extracted with the Open Drug Discovery Toolkit (ODDT). RandomForestClassifier and RandomForestRegressor performed well when trained on these interaction fingerprints together with Molecular ACCess System (MACCS) keys, Extended Connectivity Fingerprint (ECFP4), and ECFP6. The area under the precision-recall curve was 0.80, which is considered a satisfactory level of accuracy. In addition, enrichment factor analysis indicated that the trained scoring function ranked molecules better than smina's generic scoring function. Further molecular dynamics simulations indicated that the top-ranked molecules identified by the trained scoring function were highly stable in the active site, supporting the validity of the workflow. This research may provide a template for developing target-specific scoring functions against specific enzyme targets.
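
The two ranking metrics named in the abstract, the enrichment factor and the area under the precision-recall curve, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `enrichment_factor` and `average_precision` are hypothetical, the abstract does not say which top fraction was used for enrichment, and average precision is used here as a common step-wise estimate of the area under the precision-recall curve. The sketch assumes that a higher score means a more likely active, so docking scores such as smina's, where a more negative value is better, would need to be negated first. Ties between scores are not treated specially.

```python
from typing import Sequence


def _rank_labels(scores: Sequence[float], labels: Sequence[int]) -> list:
    """Return the 0/1 labels ordered from best (highest) to worst score."""
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equally long")
    if sum(labels) == 0:
        raise ValueError("at least one active (label 1) is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [label for _, label in ranked]


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.01) -> float:
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    ranked = _rank_labels(scores, labels)
    n_total = len(ranked)
    # Always inspect at least one molecule, even for very small libraries.
    n_top = max(1, int(round(n_total * fraction)))
    hit_rate_top = sum(ranked[:n_top]) / n_top
    hit_rate_all = sum(ranked) / n_total
    return hit_rate_top / hit_rate_all


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values measured at the rank of each active."""
    ranked = _rank_labels(scores, labels)
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Toy data (invented for illustration only): ten molecules, two actives.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print("EF at top 20%:", enrichment_factor(toy_scores, toy_labels, fraction=0.2))
print("Average precision:", average_precision(toy_scores, toy_labels))
```

An enrichment factor above 1 means the scoring function places actives in the inspected top fraction more often than random selection would, which is the sense in which a trained scoring function can be compared against a generic one on the same library.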

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01b9/9570399/1ffd5d46d4b7/ijms-23-11003-g001.jpg
