LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system.

Authors

Vanhoutreve Renaud, Kress Arnaud, Legrand Baptiste, Gass Hélène, Poch Olivier, Thompson Julie D

Affiliation

Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de médecine translationnelle de Strasbourg, Strasbourg, France.

Publication

BMC Bioinformatics. 2016 Jul 7;17(1):271. doi: 10.1186/s12859-016-1146-y.

Abstract

BACKGROUND

A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These applications, however sophisticated, are generally highly sensitive to the alignment used, and neglecting non-homologous or uncertain regions in the alignment can lead to significant bias in the subsequent inferences.

RESULTS

Here, we present a new method, LEON-BIS, which uses a robust Bayesian framework to estimate the homologous relations between sequences in a protein multiple alignment. Sequences are clustered into sub-families and relations are predicted at different levels, including 'core blocks', 'regions' and full-length proteins. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons using well annotated alignment databases, where the homologous sequence segments are detected with very high sensitivity and specificity.

CONCLUSIONS

LEON-BIS uses robust Bayesian statistics to distinguish the portions of multiple sequence alignments that are conserved either across the whole family or within subfamilies. LEON-BIS should thus be useful for automatic, high-throughput genome annotation, 2D/3D structure prediction, protein-protein interaction prediction, etc.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/815e/4936259/1b2efca6e30e/12859_2016_1146_Fig1_HTML.jpg
