Gadiya Yojana, Genilloud Olga, Bilitewski Ursula, Brönstrup Mark, von Berlin Leonie, Attwood Marie, Gribbon Philip, Zaliani Andrea
Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Schnackenburgallee 114, Hamburg 22525, Germany.
Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, Bonn 53113, Germany.
J Chem Inf Model. 2025 Mar 10;65(5):2416-2431. doi: 10.1021/acs.jcim.4c02347. Epub 2025 Feb 23.
While the useful armory of antibiotic drugs is continually depleted by the emergence of drug-resistant pathogens, the development of novel therapeutics has also slowed down. In the era of advanced computational methods, approaches such as machine learning (ML) could help reduce the high cost and complexity of antibiotic drug discovery and attract collaboration across organizations. In our work, we developed a large antimicrobial knowledge graph (AntiMicrobial-KG) as a repository for collecting and visualizing public antibacterial assay data. Using these data, we built ML models that efficiently scan compound libraries to identify compounds with the potential to exhibit antimicrobial activity. Our strategy involved training seven classic ML models across six compound fingerprint representations, of which the Random Forest trained on the MHFP6 fingerprint performed best, with an accuracy of 75.9% and a Cohen's Kappa score of 0.68. Finally, we illustrated the model's applicability by predicting the antimicrobial properties of two small-molecule screening libraries. First, the EU-OpenScreen library was tested against a panel of Gram-positive, Gram-negative, and fungal pathogens. Here, we found that the model correctly predicted more than 30% of the active compounds for Gram-positive, Gram-negative, and fungal pathogens. Second, for the Enamine library, a commercially available HTS compound collection with claimed antibacterial properties, we predicted antimicrobial activity and pathogen class specificity. These results may provide a means of accelerating AMR drug discovery by filtering out of commercial libraries the compounds with lower chances of being active.
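The two metrics quoted above, accuracy and Cohen's Kappa, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the class labels (`gram_pos`, `gram_neg`, `fungal`, `inactive`) and the toy predictions are hypothetical, chosen only to show how Kappa discounts the agreement expected by chance from the label frequencies.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance from the marginal label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when only one label occurs")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for eight compounds (illustration only, not study data).
y_true = ["gram_pos", "gram_pos", "gram_neg", "gram_neg",
          "fungal", "fungal", "inactive", "inactive"]
y_pred = ["gram_pos", "gram_neg", "gram_neg", "gram_neg",
          "fungal", "inactive", "inactive", "inactive"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.3f}")
```

Because Kappa subtracts chance agreement, it is lower than the raw accuracy whenever some agreement would be expected from the class frequencies alone, which is why the abstract reports both values.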