Zhang Wenxuan, Yang Jianyi, He Baoji, Walker Sara Elizabeth, Zhang Hongjiu, Govindarajoo Brandon, Virtanen Jouko, Xue Zhidong, Shen Hong-Bin, Zhang Yang
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.
Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109.
Proteins. 2016 Sep;84 Suppl 1(Suppl 1):76-86. doi: 10.1002/prot.24930. Epub 2015 Sep 23.
We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. QUARK successfully constructed models with a TM-score above 0.4 for five free-modeling (FM) targets, including the first model of T0837-D1, which has a TM-score of 0.736 and an RMSD of 2.9 Å to the native structure. Detailed analysis showed that the success is partly attributable to the high-resolution contact map prediction derived from fragment-based distance profiles, which are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. Second, in the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by their structural similarity to the ab initio folding models and then reassembled by I-TASSER-based structure assembly simulations. Compared with the QUARK pipeline, the I-TASSER pipeline successfully modeled 60% more domains, with lengths up to 204 residues, at a TM-score above 0.4. The robustness of the I-TASSER pipeline may stem from the composite fragment-assembly simulations, which combine structures from both ab initio folding and threading template refinements. Despite these promising cases, challenges remain in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the last of these was found to affect nearly all aspects of FM structure prediction, from fragment identification, target classification, and structure assembly to final model selection. Significant efforts are needed to solve these problems before real progress on FM can be made. © 2015 Wiley Periodicals, Inc.
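The abstract counts a model as successful when its TM-score exceeds 0.4. As a rough illustration of what that number measures, the sketch below evaluates the standard TM-score formula for one fixed superposition. It is not the evaluation code used in the study: the function names are hypothetical, the per-residue distances are assumed to have been computed already, and the 0.5 Å floor on the distance scale is an assumption borrowed from common TM-score implementations. The real TM-score additionally maximizes this sum over superpositions, which is omitted here.

```python
def tm_score_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 in Angstroms (standard TM-score definition)."""
    # The cube-root formula is only meaningful for L > 15; for short chains
    # we fall back to a 0.5 A floor (an assumption, see the lead-in).
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score for ONE fixed superposition.

    distances: C-alpha distances (in Angstroms) between each aligned model
               residue and its native counterpart after superposition.
    target_length: number of residues in the native (target) structure.
    Unaligned residues contribute nothing to the sum but still count in
    the normalization by target_length.
    """
    d0 = tm_score_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Toy input: a 100-residue target with every residue 2.0 A from its native position.
    score = tm_score([2.0] * 100, 100)
    print(f"TM-score = {score:.3f}, above 0.4 cutoff: {score > 0.4}")
```

Because the sum is divided by the full target length, a model that places only half of the residues perfectly scores 0.5, which is why the measure rewards coverage as well as local accuracy.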