Gutkin Evgeny, Gusev Filipp, Gentile Francesco, Ban Fuqiang, Koby S Benjamin, Narangoda Chamali, Isayev Olexandr, Cherkasov Artem, Kurnikova Maria G
Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Chem Sci. 2024 Apr 11;15(23):8800-8812. doi: 10.1039/d3sc06880c. eCollection 2024 Jun 12.
The Critical Assessment of Computational Hit-Finding Experiments (CACHE) Challenge series is focused on identifying small-molecule inhibitors of protein targets using computational methods. Each challenge contains two phases, hit-finding and follow-up optimization, each of which is followed by experimental validation of the computational predictions. For CACHE Challenge #1, the Leucine-Rich Repeat Kinase 2 (LRRK2) WD40 Repeat (WDR) domain was selected as the target for hit-finding and optimization. Mutations in LRRK2 are the most common genetic cause of the familial form of Parkinson's disease. The LRRK2 WDR domain is an understudied drug target with no known molecular inhibitors. Herein we detail the first phase of our winning submission to CACHE Challenge #1. We developed a framework for the high-throughput structure-based virtual screening of a chemically diverse small-molecule space. Hit identification was performed using the large-scale Deep Docking (DD) protocol followed by absolute binding free energy (ABFE) simulations. ABFEs were computed using an automated molecular dynamics (MD)-based thermodynamic integration (TI) approach. In total, 4.1 billion ligands from Enamine REAL were screened with DD, and ABFEs were then computed by MD TI for 793 ligands. A total of 76 ligands were prioritized for experimental validation; 59 compounds were successfully synthesized and 5 were identified as hits, yielding an 8.5% hit rate. Our results demonstrate the efficacy of the combined DD and ABFE approaches for hit identification for a target with no previously known hits. This approach is widely applicable for the efficient screening of ultra-large chemical libraries as well as for rigorous protein-ligand binding affinity estimation leveraging modern computational resources.
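The screening funnel reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the stage labels and the helper names `fraction_retained` and `hit_rate` are assumptions introduced here, not part of the authors' workflow, and the only data used are the counts quoted above. It shows that the stated 8.5% corresponds to 5 hits out of the 59 synthesized compounds.

```python
# Counts quoted in the abstract for each stage of the screening funnel.
# Stage labels are paraphrased for this example.
funnel = [
    ("screened with Deep Docking", 4_100_000_000),
    ("ABFE computed by MD TI", 793),
    ("prioritized for experimental validation", 76),
    ("successfully synthesized", 59),
    ("identified as hits", 5),
]


def fraction_retained(stages):
    """Return (label, count, fraction of the previous stage) for every stage after the first."""
    return [
        (label, count, count / prev_count)
        for (_, prev_count), (label, count) in zip(stages, stages[1:])
    ]


def hit_rate(hits, tested):
    """Hit rate in percent: hits divided by the number of compounds tested."""
    return 100.0 * hits / tested


for label, count, frac in fraction_retained(funnel):
    print(f"{label}: {count} ({frac:.2e} of previous stage)")

# 5 hits out of 59 synthesized compounds rounds to the reported 8.5%.
print(f"hit rate: {hit_rate(5, 59):.1f}%")
```

Taking the 59 synthesized compounds as the denominator reproduces the reported figure; dividing by the 76 prioritized ligands instead would give a lower value.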