Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations.

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

These authors contributed equally.

Publication information

Cell Rep Methods. 2021 Jul 26;1(3). doi: 10.1016/j.crmeth.2021.100014. Epub 2021 Jun 21.

DOI:10.1016/j.crmeth.2021.100014

PMID:34355210

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8336924/

Abstract

Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate inter-residue contact maps predicted by deep neural networks with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice as many non-homologous proteins as I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of the proteins in the SARS-CoV-2 genome, despite the high mutation rate of the virus and its sparse sequence profiles. The results demonstrate the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction.
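To make the idea of a contact map concrete, the following is a minimal Python sketch and not the C-I-TASSER implementation. It assumes one representative atom coordinate per residue, a distance cutoff of 8 Å (a common convention in contact prediction, not stated in this abstract), and a minimum sequence separation. The function names `contact_map` and `contact_satisfaction` are hypothetical and chosen only for illustration.

```python
import math


def contact_map(coords, cutoff=8.0, min_sep=6):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    coords  : list of (x, y, z) tuples, one representative atom per residue
    cutoff  : distance threshold in angstroms (assumed value, not from the paper)
    min_sep : minimum sequence separation, to skip trivially close neighbours
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


def contact_satisfaction(predicted, coords, cutoff=8.0):
    """Fraction of predicted contacts that a structural model actually realises."""
    if not predicted:
        return 0.0
    hits = sum(
        1 for i, j in predicted if math.dist(coords[i], coords[j]) < cutoff
    )
    return hits / len(predicted)
```

A score such as `contact_satisfaction` is one simple way to ask how well a candidate model agrees with a predicted contact map. How C-I-TASSER actually weights contacts inside its assembly simulations is not described in this abstract.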

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa87/9017242/304ba67c2d6d/fx1.jpg

Similar articles

Deep-learning contact-map guided protein structure prediction in CASP13.

Proteins. 2019 Dec;87(12):1149-1164. doi: 10.1002/prot.25792. Epub 2019 Aug 14.

Ab initio modeling of small proteins by iterative TASSER simulations.

BMC Biol. 2007 May 8;5:17. doi: 10.1186/1741-7007-5-17.

I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction.

Nat Protoc. 2022 Oct;17(10):2326-2353. doi: 10.1038/s41596-022-00728-0. Epub 2022 Aug 5.

Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12.

Proteins. 2018 Mar;86 Suppl 1(Suppl 1):136-151. doi: 10.1002/prot.25414. Epub 2017 Nov 14.

NMR data-driven structure determination using NMR-I-TASSER in the CASD-NMR experiment.

J Biomol NMR. 2015 Aug;62(4):511-25. doi: 10.1007/s10858-015-9914-y. Epub 2015 Mar 4.

Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11.

Proteins. 2016 Sep;84 Suppl 1(Suppl 1):76-86. doi: 10.1002/prot.24930. Epub 2015 Sep 23.

Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions.

Nat Commun. 2021 Aug 18;12(1):5011. doi: 10.1038/s41467-021-25316-w.

Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14.

Proteins. 2021 Dec;89(12):1734-1751. doi: 10.1002/prot.26193. Epub 2021 Aug 7.

Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

PLoS Comput Biol. 2017 Jan 5;13(1):e1005324. doi: 10.1371/journal.pcbi.1005324. eCollection 2017 Jan.

Cited by

Genome-wide identification, characterization and evolutionary analysis of the pyrroline-5-carboxylate synthetase (P5CS), succinic semialdehyde dehydrogenase (SSADH), and dehydrin (DHN) genes in Solanum lycopersicum under drought stress.

BMC Plant Biol. 2025 Aug 9;25(1):1060. doi: 10.1186/s12870-025-07057-w.

A cryptic START domain regulates deeply conserved transcription factors.

bioRxiv. 2025 Aug 1:2025.07.29.667167. doi: 10.1101/2025.07.29.667167.

Phenotypic dichotomy in Crotalus durissus ruruima venom and potential consequences for clinical management of snakebite envenomations.

PLoS Negl Trop Dis. 2025 Aug 1;19(8):e0013296. doi: 10.1371/journal.pntd.0013296. eCollection 2025 Aug.

Chemosensory Receptors in Vertebrates: Structure and Computational Modeling Insights.

Int J Mol Sci. 2025 Jul 10;26(14):6605. doi: 10.3390/ijms26146605.

Search continues: Exploring immunoinformatics platforms for designing an effective multiepitope malaria vaccine candidate.

BioTechnologia (Pozn). 2025 Jun 30;106(2):151-168. doi: 10.5114/bta/204528. eCollection 2025.

Impact of SARS-CoV-2 Variant NSP6 on Pathogenicity: Genetic Analysis and Cell Biology.

Curr Issues Mol Biol. 2025 May 14;47(5):361. doi: 10.3390/cimb47050361.

Bioinformatic Analysis of WNT Family Proteins.

Bioinform Biol Insights. 2025 Jul 15;19:11779322251353347. doi: 10.1177/11779322251353347. eCollection 2025.

The Structural Basis of Binding Stability and Selectivity of Sarolaner Enantiomers for RDL Receptors.

Molecules. 2025 Jun 26;30(13):2756. doi: 10.3390/molecules30132756.

Artificial Intelligence (AI)-Based Protein Structure Prediction and Analysis.

Methods Mol Biol. 2025;2952:39-57. doi: 10.1007/978-1-0716-4690-8_3.

Predicted conformations of 5-HT3 receptor ion channels are modified by subunit D.

Comput Struct Biotechnol J. 2025 May 29;27:2394-2402. doi: 10.1016/j.csbj.2025.05.048. eCollection 2025.
