Predicting protein function via downward random walks on a gene ontology.

Authors

Yu Guoxian, Zhu Hailong, Domeniconi Carlotta, Liu Jiming

Affiliations

College of Computer and Information Sciences, Southwest University, Beibei, Chongqing, China.

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China.

Publication information

BMC Bioinformatics. 2015 Aug 27;16:271. doi: 10.1186/s12859-015-0713-y.

Abstract

BACKGROUND

High-throughput biotechnologies generate an ever-increasing amount of genomic and proteomic data. Most of these data remain far from functionally characterized, despite advances in the functional annotation of genes (or of their protein products). Because of the limitations of experimental techniques and research bias in biology, regularly updated functional annotation databases, such as the Gene Ontology (GO), are far from complete. Given the importance of protein function for biological studies and drug design, proteins should be annotated more comprehensively and precisely.

RESULTS

We propose downward Random Walks (dRW) to predict missing (or new) functions of partially annotated proteins. Specifically, we apply downward random walks with restart on the GO directed acyclic graph, together with the available functions of a protein, to estimate the probability of each missing function. To further boost prediction accuracy, we extend dRW to dRW-kNN. dRW-kNN computes the semantic similarity between proteins from their functional annotations; it then predicts functions by combining the functions estimated by dRW with the functions associated with the k nearest proteins. The proposed models can predict two kinds of missing functions: (i) those that are missing for a protein but associated with other proteins of interest; (ii) those that are not available for any protein of interest but exist in the GO hierarchy. Experiments on yeast and human proteins show that dRW and dRW-kNN replenish functions more accurately than other related approaches, especially for sparse functions, i.e., those associated with no more than 10 proteins.
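The abstract does not give the transition or restart formulas, so the following is only a minimal sketch of a downward random walk with restart, not the authors' implementation. It assumes a toy DAG, uniform transition probabilities from a term to its direct children, a restart probability of 0.5, and that a walker reaching a leaf returns to the seed terms; the names `downward_rwr`, `go_children`, and the term labels are hypothetical. The kNN step of dRW-kNN is not covered.

```python
def downward_rwr(children, seeds, restart=0.5, n_iter=100, tol=1e-10):
    """Random walk with restart that only follows parent -> child edges.

    children: dict mapping a term to the list of its direct child terms.
    seeds:    terms already annotated to the protein (restart distribution).
    Uniform transitions and the leaf rule below are illustrative assumptions.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one annotated term is required")

    # Collect every term that appears as a parent, a child, or a seed.
    terms = set(children) | seeds
    for kids in children.values():
        terms.update(kids)

    # The walker starts (and restarts) uniformly on the annotated terms.
    start = {t: (1.0 / len(seeds) if t in seeds else 0.0) for t in terms}
    score = dict(start)

    for _ in range(n_iter):
        nxt = {t: restart * start[t] for t in terms}
        for term, mass in score.items():
            kids = children.get(term, [])
            if kids:
                # Move downward: split the remaining mass among the children.
                share = (1.0 - restart) * mass / len(kids)
                for kid in kids:
                    nxt[kid] += share
            else:
                # Assumption: at a leaf the walker jumps back to the seeds.
                share = (1.0 - restart) * mass / len(seeds)
                for seed in seeds:
                    nxt[seed] += share
        delta = max(abs(nxt[t] - score[t]) for t in terms)
        score = nxt
        if delta < tol:
            break
    return score


# Toy hierarchy (hypothetical term names), edges point from parent to child.
go_children = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
}

# A protein currently annotated only with term "a".
annotated = {"a"}
scores = downward_rwr(go_children, annotated)

# Candidate missing functions: unannotated terms with a positive score,
# ranked from most to least likely under this sketch.
candidates = sorted(
    (t for t in scores if t not in annotated and scores[t] > 0.0),
    key=lambda t: scores[t],
    reverse=True,
)
print(candidates)
```

Because the walk only moves from parents to children, terms that are not descendants of an annotated term (here `root`, `b`, and `d`) receive no score; in this sketch only the more specific descendant `c` is proposed as a candidate.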

CONCLUSION

The empirical study shows that both the semantic similarity between GO terms and the ontology hierarchy play important roles in predicting protein function. The proposed dRW and dRW-kNN can serve as tools for replenishing the functions of partially annotated proteins.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63bc/4551531/e4c8436b01f3/12859_2015_713_Fig1_HTML.jpg
