Highly accurate protein structure prediction with AlphaFold.

Affiliations

DeepMind, London, UK.

School of Biological Sciences, Seoul National University, Seoul, South Korea.

Publication information

Nature. 2021 Aug;596(7873):583-589. doi: 10.1038/s41586-021-03819-2. Epub 2021 Jul 15.

Abstract

Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort, the structures of around 100,000 unique proteins have been determined, but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the 'protein folding problem') has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c1f/8387230/d61217a1e325/41586_2021_3819_Fig1_HTML.jpg
