Samad Abdus, Ajmal Amar, Mahmood Arif, Khurshid Beenish, Li Ping, Jan Syed Mansoor, Rehman Ashfaq Ur, He Pei, Abdalla Ashraf N, Umair Muhammad, Hu Junjian, Wadood Abdul
Department of Biochemistry, Abdul Wali Khan University, Mardan, KPK, Pakistan.
Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China.
Front Mol Biosci. 2023 Mar 7;10:1060076. doi: 10.3389/fmolb.2023.1060076. eCollection 2023.
The novel coronavirus SARS-CoV-2, which emerged in late 2019 in Wuhan, China, is the causative agent of the COVID-19 pandemic. Its primary protease, also known as the main protease or 3-chymotrypsin-like protease (3CL), plays a vital role in viral replication and is therefore a potential drug target. The current study aimed to identify novel phytochemical therapeutics against 3CL by machine learning-based virtual screening. A total of 4,000 phytochemicals were collected from extensive literature surveys and various other sources. The 2D structures of these phytochemicals were retrieved from the PubChem database, and 2D descriptors were calculated with the Molecular Operating Environment (MOE). Machine learning-based virtual screening was performed to predict phytochemicals active against SARS-CoV-2 3CL. Among the machine learning algorithms tested, random forest achieved 98% accuracy on both the training and test sets. The random forest model was then used to screen the 4,000 phytochemicals, which led to the identification of 26 inhibitors of 3CL. These hits were docked into the active site of 3CL. Based on docking scores and protein-ligand interactions, 100 ns MD simulations were performed for the top 5 novel inhibitors, ivermectin, and the apo state of 3CL. The post-dynamics analyses, i.e., root mean square deviation (RMSD), root mean square fluctuation (RMSF), and MM-GBSA, reveal that the newly identified phytochemicals form significant interactions in the binding pocket of 3CL and form stable complexes, indicating that these phytochemicals could be used as potential antagonists of SARS-CoV-2.
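For readers unfamiliar with the two trajectory metrics named in the abstract, the following is a minimal sketch of how RMSD and RMSF are conventionally defined. It is not the authors' analysis pipeline. It assumes the frames have already been superposed onto the reference structure (no fitting is done here), and the function names and toy coordinates are hypothetical.

```python
import math

def rmsd(frame, reference):
    """RMSD between one frame and a reference structure.

    Both arguments are lists of (x, y, z) tuples with one entry per atom.
    Assumption: the frame is already aligned to the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))

def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position.

    trajectory is a list of frames; each frame is a list of (x, y, z) tuples.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Squared deviation from the mean, summed over frames and coordinates
        squared = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        )
        result.append(math.sqrt(squared / n_frames))
    return result

# Toy trajectory (hypothetical data): 3 frames, 2 atoms, coordinates in angstroms.
# Atom 0 stays fixed; atom 1 drifts along y.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]

rmsd_per_frame = [rmsd(frame, trajectory[0]) for frame in trajectory]
rmsf_per_atom = rmsf(trajectory)
```

In this reading, a low, plateauing RMSD over the simulation suggests that a complex stays close to its starting structure, while RMSF highlights which residues or atoms are most mobile.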