Brylinski Michal
Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America.
PLoS Comput Biol. 2014 Sep 18;10(9):e1003829. doi: 10.1371/journal.pcbi.1003829. eCollection 2014 Sep.
Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques; however, to achieve high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome-scale applications, where only structure models of varying quality are available for the majority of gene products. To improve on the state of the art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4-9% lower than that obtained using crystal structures. This represents a significant improvement over other algorithms; for example, the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility of investigating drug-protein interaction networks for complete proteomes, with prospective systems-level applications in polypharmacology and rational drug repositioning. eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite.
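To illustrate what "sequence order-independent" means in practice, the following minimal sketch pairs residues from two binding sites purely by a similarity score, ignoring where they sit in either sequence. It is not the eMatchSite algorithm, whose scoring and optimization are not described in this abstract. The residue classes, the `pair_score` function, and the example sites are hypothetical, and the exhaustive search is only workable for very small pockets; a real implementation would likely use a dedicated assignment solver.

```python
from itertools import permutations

# Hypothetical coarse residue classes, used only for this toy score.
RESIDUE_CLASS = {
    "GLY": "small", "ALA": "small",
    "VAL": "hydrophobic", "LEU": "hydrophobic",
    "ILE": "hydrophobic", "PHE": "hydrophobic",
    "SER": "polar", "THR": "polar", "ASN": "polar",
    "LYS": "positive", "ARG": "positive",
    "ASP": "negative", "GLU": "negative",
}


def pair_score(name_a, name_b):
    """Toy similarity: 1.0 for identical residues, 0.5 for the same class, else 0.0."""
    if name_a == name_b:
        return 1.0
    class_a = RESIDUE_CLASS.get(name_a)
    if class_a is not None and class_a == RESIDUE_CLASS.get(name_b):
        return 0.5
    return 0.0


def align_sites(site_a, site_b):
    """Find the best one-to-one residue pairing between two sites.

    Each site is a list of (residue_number, residue_name) tuples.
    Residue numbers are carried along but never used for scoring,
    so the pairing is independent of sequence order.
    Returns (total_score, [(residue_from_a, residue_from_b), ...]).
    """
    swapped = len(site_a) > len(site_b)
    short, long_ = (site_b, site_a) if swapped else (site_a, site_b)

    best_score, best_pairs = -1.0, []
    # Exhaustive search over all ways to assign each residue of the
    # shorter site to a distinct residue of the longer site.
    for perm in permutations(range(len(long_)), len(short)):
        total = sum(pair_score(short[i][1], long_[j][1])
                    for i, j in enumerate(perm))
        if total > best_score:
            best_score = total
            best_pairs = [(short[i], long_[j]) for i, j in enumerate(perm)]

    if swapped:
        # Restore the (site_a residue, site_b residue) orientation.
        best_pairs = [(a, b) for b, a in best_pairs]
    return best_score, best_pairs


# Two made-up pockets whose equivalent residues occur in different sequence order.
site_a = [(12, "GLY"), (45, "LYS"), (101, "ASP"), (130, "LEU")]
site_b = [(210, "ASP"), (15, "LEU"), (88, "GLY"), (60, "ARG")]
total, pairs = align_sites(site_a, site_b)
```

In this made-up example, residue 101 of the first site is paired with residue 210 of the second, while residue 130 is paired with residue 15, an ordering that a conventional sequence alignment could not represent.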