PON-All: Amino Acid Substitution Tolerance Predictor for All Organisms.

Author information

Yang Yang, Shao Aibin, Vihinen Mauno

Affiliations

School of Computer Science and Technology, Soochow University, Suzhou, China.

Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, China.

Publication information

Front Mol Biosci. 2022 Jun 16;9:867572. doi: 10.3389/fmolb.2022.867572. eCollection 2022.

Abstract

Genetic variations are investigated in humans and many other organisms for many purposes (e.g., to aid in clinical diagnosis). Interpreting the identified variations can be challenging. Although some dedicated prediction methods have been developed, and some tools for human variants can also be used for other organisms, their performance and species range have been limited. We developed a novel variant pathogenicity/tolerance predictor for amino acid substitutions in any organism. The method, PON-All, is a machine learning tool trained on human, animal, and plant variants. Two versions are provided, one that uses Gene Ontology (GO) annotations and one that does not, because GO annotations are unavailable or incomplete for many organisms of interest. Both versions provide predictions for three classes: pathogenic, benign, and variants of unknown significance. On the blind test, accuracy was 0.913 and the Matthews correlation coefficient (MCC) was 0.827 when GO annotations were used; without GO features, accuracy was 0.856 and MCC 0.712. Performance is best for human and plant variants and somewhat lower for animal variants, because the number of known disease-causing variants in animals is rather small. The method was compared to several other tools and was found to have superior performance. PON-All is freely available at http://structure.bmc.lu.se/PON-All and http://8.133.174.28:8999/.
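The abstract reports accuracy and MCC, two standard measures for binary classifiers. As a reading aid, the minimal sketch below shows how both are computed from a confusion matrix, using their standard textbook definitions. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when the denominator is zero (a common convention
    when any row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts (pathogenic = positive, benign = negative).
    # These are NOT data from the PON-All study.
    tp, tn, fp, fn = 90, 85, 15, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the classes are imbalanced. That is one reason the two measures are often reported together.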
