Ritchie David W, Kozakov Dima, Vajda Sandor
Department of Computing Science, University of Aberdeen, Aberdeen, Scotland, UK.
Bioinformatics. 2008 Sep 1;24(17):1865-73. doi: 10.1093/bioinformatics/btn334. Epub 2008 Jun 30.
Predicting how proteins interact at the molecular level is a computationally intensive task. Many protein docking algorithms begin by using fast Fourier transform (FFT) correlation techniques to find putative rigid body docking orientations. Most such approaches use 3D Cartesian grids and are therefore limited to computing three-dimensional (3D) translational correlations. However, translational FFTs can speed up the calculation in only three of the six rigid body degrees of freedom, and they cannot easily incorporate prior knowledge about a complex to focus and hence further accelerate the calculation. Furthermore, several groups have developed multi-term interaction potentials and others use multi-copy approaches to simulate protein flexibility, both of which add to the computational cost of FFT-based docking algorithms. Hence there is a need to develop more powerful and more versatile FFT docking techniques.
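To make the conventional Cartesian approach concrete, the following is a minimal sketch of a 3D translational FFT correlation, assuming both proteins have already been rasterised onto identical cubic grids. The function name, the toy box-shaped "proteins", and the plain overlap score are illustrative assumptions, not the scoring scheme of any particular docking program.

```python
import numpy as np

def fft_translational_correlation(receptor, ligand):
    """Score all cyclic 3D translations of `ligand` against `receptor`.

    Returns an array C with C[t] = sum_x receptor[x] * ligand[x - t],
    where indices wrap around the grid (hypothetical helper for illustration).
    """
    R = np.fft.fftn(receptor)
    L = np.fft.fftn(ligand)
    # Correlation theorem: one inverse FFT yields the score of every shift.
    return np.fft.ifftn(R * np.conj(L)).real

# Toy example: two 3x3x3 blocks placed at different grid positions.
receptor = np.zeros((16, 16, 16))
receptor[5:8, 6:9, 7:10] = 1.0
ligand = np.zeros((16, 16, 16))
ligand[1:4, 1:4, 1:4] = 1.0

scores = fft_translational_correlation(receptor, ligand)
best_shift = np.unravel_index(np.argmax(scores), scores.shape)
print("best translation (grid units):", best_shift)
```

Note that this covers only the three translational degrees of freedom for one fixed ligand orientation; the three rotational degrees of freedom would still require an outer loop that re-grids the ligand for every orientation.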
This article presents a closed-form 6D spherical polar Fourier correlation expression from which arbitrary multi-dimensional multi-property multi-resolution FFT correlations may be generated. The approach is demonstrated by calculating 1D, 3D and 5D rotational correlations of 3D shape and electrostatic expansions up to polynomial order L=30 on a 2 GB personal computer. As expected, 3D correlations are found to be considerably faster than 1D correlations but, surprisingly, 5D correlations are often slower than 3D correlations. Nonetheless, we show that 5D correlations will be advantageous when calculating multi-term knowledge-based interaction potentials. When docking the 84 complexes of the Protein Docking Benchmark, blind 3D shape plus electrostatic correlations take around 30 minutes on a contemporary personal computer and find acceptable solutions within the top 20 in 16 cases. Applying a simple angular constraint to focus the calculation around the receptor binding site produces acceptable solutions within the top 20 in 28 cases. Further constraining the search to the ligand binding site gives up to 48 solutions within the top 20, with calculation times of just a few minutes per complex. Hence the approach described provides a practical and fast tool for rigid body protein-protein docking, especially when prior knowledge about one or both binding sites is available.
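As a simplified illustration of a rotational correlation, the sketch below evaluates a 1D correlation over a single rotation angle about the z axis with one FFT. It is not the 6D expression of the article: it assumes each property is given only by spherical-harmonic-style coefficients indexed by degree l and order m, and it assumes the convention that a rotation by angle alpha multiplies the order-m coefficient by exp(-i m alpha). The function name and array layout are hypothetical.

```python
import numpy as np

def rotational_correlation_1d(a, b, n_angles):
    """Overlap of A with B rotated about z, for n_angles evenly spaced angles.

    a, b: complex arrays of shape (lmax + 1, 2 * lmax + 1); column m + lmax
    holds order m (assumed layout). Assumed convention: rotating by alpha
    multiplies the order-m coefficient by exp(-1j * m * alpha).
    Returns S with S[k] = Re sum_{l,m} a[l,m] * conj(b[l,m]) * exp(1j*m*alpha_k),
    where alpha_k = 2 * pi * k / n_angles.
    """
    lmax = a.shape[0] - 1
    if n_angles < 2 * lmax + 1:
        raise ValueError("n_angles must be at least 2 * lmax + 1")
    # Collapse the degree index l: only the order m couples to the angle.
    c = np.sum(a * np.conj(b), axis=0)
    # Place order m at FFT bin m mod n_angles (negative m wraps around).
    padded = np.zeros(n_angles, dtype=complex)
    m = np.arange(-lmax, lmax + 1)
    padded[m % n_angles] = c
    # numpy's ifft carries a 1/n factor, so multiply it back.
    return (n_angles * np.fft.ifft(padded)).real

# Toy check: B is A pre-rotated by k0 angular steps, so the best angle undoes it.
rng = np.random.default_rng(0)
lmax, n_angles, k0 = 4, 32, 5
m = np.arange(-lmax, lmax + 1)
a = rng.normal(size=(lmax + 1, 2 * lmax + 1)) + 1j * rng.normal(size=(lmax + 1, 2 * lmax + 1))
a[np.abs(m)[None, :] > np.arange(lmax + 1)[:, None]] = 0.0  # enforce |m| <= l
b = a * np.exp(-1j * m * 2 * np.pi * k0 / n_angles)

s = rotational_correlation_1d(a, b, n_angles)
best_k = int(np.argmax(s))
print("best rotation step:", best_k)
```

The design point is that the cost of scanning all angles drops from one coefficient sum per angle to a single FFT; higher-dimensional rotational correlations extend the same idea to more angular variables at the price of larger intermediate arrays.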