Predicting the functional effect of amino acid substitutions and indels.

Affiliation

The J. Craig Venter Institute, Rockville, Maryland, United States of America.

Publication information

PLoS One. 2012;7(10):e46688. doi: 10.1371/journal.pone.0046688. Epub 2012 Oct 8.

Abstract

As next-generation sequencing projects generate massive genome-wide sequence variation data, bioinformatics tools are being developed to provide computational predictions of the functional effects of sequence variations and to narrow the search for causal variants underlying disease phenotypes. Different classes of sequence variations at the nucleotide level are involved in human diseases, including substitutions, insertions, deletions, frameshifts, and nonsense mutations. Frameshifts and nonsense mutations are likely to have a negative effect on protein function. Existing prediction tools primarily focus on the deleterious effects of single amino acid substitutions by examining amino acid conservation at the position of interest among related sequences, an approach that is not directly applicable to insertions or deletions. Here, we introduce a versatile alignment-based score as a new metric to predict the damaging effects of variations, covering not only single amino acid substitutions but also in-frame insertions, deletions, and multiple amino acid substitutions. This alignment-based score measures the change in sequence similarity of a query sequence to a protein sequence homolog before and after the introduction of an amino acid variation to the query sequence. Our results showed that the scoring scheme performs well in separating disease-associated variants (n = 21,662) from common polymorphisms (n = 37,022) for UniProt human protein variations, and also in separating deleterious variants (n = 15,179) from neutral variants (n = 17,891) for UniProt non-human protein variations. In our approach, the area under the receiver operating characteristic curve (AUC) for the human and non-human protein variation datasets is ∼0.85. We also observed that the alignment-based score correlates with the deleteriousness of a sequence variation.
In summary, we have developed a new algorithm, PROVEAN (Protein Variation Effect Analyzer), which provides a generalized approach to predict the functional effects of protein sequence variations including single or multiple amino acid substitutions, and in-frame insertions and deletions. The PROVEAN tool is available online at http://provean.jcvi.org.
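
The idea of an alignment-based delta score, as described in the abstract, can be illustrated with a minimal Python sketch. This is not the PROVEAN implementation: the abstract does not specify the substitution matrix, gap penalties, or how scores from multiple homologs are combined, so the match/mismatch/gap values and the function names `global_alignment_score`, `apply_variant`, and `delta_score` below are illustrative assumptions only. The sketch scores a query against a single homolog before and after a variation is applied and returns the difference.

```python
def global_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are placeholders (assumption), not PROVEAN's parameters.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def apply_variant(seq, start, ref, alt):
    """Replace `ref` at 0-based `start` with `alt`.

    An empty `alt` is a deletion; an empty `ref` is an insertion.
    """
    if seq[start:start + len(ref)] != ref:
        raise ValueError("reference residues do not match the query sequence")
    return seq[:start] + alt + seq[start + len(ref):]


def delta_score(query, homolog, start, ref, alt):
    """Change in similarity to the homolog caused by the variation."""
    variant = apply_variant(query, start, ref, alt)
    return (global_alignment_score(variant, homolog)
            - global_alignment_score(query, homolog))


# Hypothetical toy sequences, for illustration only.
query = "MKTAYIAKQR"
homolog = "MKTAYIAKQR"
substitution = delta_score(query, homolog, 4, "Y", "D")   # single substitution
deletion = delta_score(query, homolog, 3, "AY", "")       # in-frame deletion
insertion = delta_score(query, homolog, 5, "", "GG")      # in-frame insertion
```

Because the same function handles substitutions, deletions, and insertions, the sketch shows why a similarity-change score generalizes beyond single-residue conservation checks. A more negative value means the variant sequence is less similar to the homolog than the original query was.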

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/12fa/3466303/9d74e4791508/pone.0046688.g001.jpg
