Electrical Engineering and Computer Science Department, University of Missouri, Columbia, MO, 65211, USA.
Department of Computer Science, Saint Louis University, St. Louis, MO, 63103, USA.
BMC Bioinformatics. 2021 Jan 25;22(1):30. doi: 10.1186/s12859-021-03960-9.
Driven by deep learning, inter-residue contact/distance prediction has improved significantly and has substantially enhanced ab initio protein structure prediction. Currently, most distance prediction methods classify inter-residue distances into multiple distance intervals instead of directly predicting real-value distances. The output of such classifiers has to be converted into real-value distances before it can be used in tertiary structure prediction.
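The conversion step mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible scheme, taking the probability-weighted mean of the bin midpoints for one residue pair; the function name `expected_distance` and the four-bin layout are hypothetical and are not taken from DeepDist or any other published predictor.

```python
def expected_distance(probs, bin_edges):
    """Collapse one residue pair's distance-bin distribution into a single
    real-valued distance (in angstroms) via the probability-weighted mean
    of the bin midpoints.

    probs     -- predicted probability for each distance bin
    bin_edges -- ascending bin boundaries, one more entry than probs
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have exactly len(probs) + 1 entries")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have positive total mass")
    midpoints = [(lo + hi) / 2.0 for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    # Normalise by the total so slightly unnormalised outputs still work.
    return sum(p * m for p, m in zip(probs, midpoints)) / total


# Hypothetical layout (an assumption for illustration): 4-8, 8-12, 12-16, 16-20 angstroms.
edges = [4.0, 8.0, 12.0, 16.0, 20.0]
probs = [0.7, 0.2, 0.1, 0.0]
d = expected_distance(probs, edges)
print(round(d, 3))
```

Other conversions are possible, such as taking the midpoint of the most probable bin; the weighted mean is used here only because it is the shortest to state.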
To explore the potential of predicting real-value inter-residue distances, we develop a multi-task deep learning distance predictor (DeepDist) based on new residual convolutional network architectures to simultaneously predict real-value inter-residue distances and classify them into multiple distance intervals. Tested on 43 CASP13 hard domains, DeepDist achieves comparable performance in real-value distance prediction and multi-class distance prediction. The average mean square error (MSE) of DeepDist's real-value distance prediction is 0.896 Å² when predicted distances ≥ 16 Å are filtered out, which is lower than the 1.003 Å² of DeepDist's multi-class distance prediction. When distance predictions are converted into contact predictions at the 8 Å threshold (the standard threshold in the field), the precision of the top L/5 and L/2 contact predictions of DeepDist's multi-class distance prediction is 79.3% and 66.1%, respectively, higher than both the 78.6% and 64.5% of its real-value distance prediction and the best results in the CASP13 experiment.
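The two evaluation measures quoted above can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the function names, the `min_sep` sequence-separation parameter, and the toy 4-residue distance maps are all hypothetical, and a contact is taken to be a true distance strictly below the threshold. In a real evaluation, `k` would be set to L/5 or L/2 for a domain of length L.

```python
def filtered_mse(pred, true, cutoff=16.0, min_sep=1):
    """Mean square error (in squared angstroms) over residue pairs (i, j),
    j - i >= min_sep, whose PREDICTED distance is below `cutoff`."""
    n = len(pred)
    sq_errors = [
        (pred[i][j] - true[i][j]) ** 2
        for i in range(n)
        for j in range(i + min_sep, n)
        if pred[i][j] < cutoff
    ]
    if not sq_errors:
        raise ValueError("no residue pairs left after filtering")
    return sum(sq_errors) / len(sq_errors)


def top_k_contact_precision(pred, true, k, threshold=8.0, min_sep=1):
    """Rank pairs by predicted distance (shortest first), keep the top k,
    and return the fraction whose true distance is below `threshold`."""
    n = len(pred)
    pairs = [(pred[i][j], i, j) for i in range(n) for j in range(i + min_sep, n)]
    pairs.sort()  # ascending predicted distance = most confident contacts first
    top = pairs[:k]
    if not top:
        raise ValueError("no residue pairs to evaluate")
    hits = sum(1 for _, i, j in top if true[i][j] < threshold)
    return hits / len(top)


# Toy symmetric distance maps in angstroms (made up for illustration only).
pred = [[0.0, 3.8, 6.0, 17.0],
        [3.8, 0.0, 3.8, 7.5],
        [6.0, 3.8, 0.0, 3.8],
        [17.0, 7.5, 3.8, 0.0]]
true = [[0.0, 3.8, 7.0, 15.0],
        [3.8, 0.0, 3.8, 9.0],
        [7.0, 3.8, 0.0, 3.8],
        [15.0, 9.0, 3.8, 0.0]]

mse = filtered_mse(pred, true, cutoff=16.0, min_sep=2)
precision = top_k_contact_precision(pred, true, k=2, threshold=8.0, min_sep=2)
print(mse, precision)
```

In the toy data, the pair predicted at 17.0 Å is excluded from the MSE because its predicted distance exceeds the cutoff, even though its true distance (15.0 Å) does not; the filter acts on predictions, as described in the text.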
DeepDist can predict inter-residue distances well and improve binary contact prediction over the existing state-of-the-art methods. Moreover, the predicted real-value distances can be directly used to reconstruct protein tertiary structures better than multi-class distance predictions due to the lower MSE. Finally, we demonstrate that predicting the real-value distance map and multi-class distance map at the same time performs better than predicting real-value distances alone.