H-CORE: enabling genome-scale Bayesian analysis of biological systems without prior knowledge.

Authors

Jung Sungwon, Lee Kwang H, Lee Doheon

Affiliation

Department of Electrical Engineering and Computer Science, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea.

Publication information

Biosystems. 2007 Jul-Aug;90(1):197-210. doi: 10.1016/j.biosystems.2006.08.004. Epub 2006 Aug 22.

Abstract

The Bayesian network is a popular tool for describing relationships between data entities by representing probabilistic (in)dependencies with a directed acyclic graph (DAG) structure. Relationships have been inferred between biological entities using the Bayesian network model with high-throughput data from biological systems in diverse fields. However, the scalability of those approaches is seriously restricted because of the huge search space for finding an optimal DAG structure in the process of Bayesian network learning. For this reason, most previous approaches limit the number of target entities or use additional knowledge to restrict the search space. In this paper, we use the hierarchical clustering and order restriction (H-CORE) method for the learning of large Bayesian networks by clustering entities and restricting edge directions between those clusters, with the aim of overcoming the scalability problem and thus making it possible to perform genome-scale Bayesian network analysis without additional biological knowledge. We use simulations to show that H-CORE is much faster than the widely used sparse candidate method, while learning networks of comparable quality. We have also applied H-CORE to retrieving gene-to-gene relationships in a biological system (the 'Rosetta compendium'). By evaluating the learned relationships through literature mining, we demonstrate that H-CORE enables the genome-scale Bayesian analysis of biological systems without any prior knowledge.
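The idea of restricting edge directions between clusters can be illustrated with a small sketch. The abstract does not say how the clusters are built or how they are ordered, so the cluster assignment, the cluster order, and the function name `allowed_edges` below are all hypothetical. This is not the authors' implementation, only an illustration of how an order restriction shrinks the set of candidate edges a structure search has to consider.

```python
from itertools import permutations


def allowed_edges(cluster_of, cluster_order):
    """Return the directed edges permitted under a cluster-order restriction.

    cluster_of: dict mapping variable name -> cluster label (assumed to be
                given, e.g. by a hierarchical clustering step not shown here).
    cluster_order: list of cluster labels; an edge between two different
                   clusters may only point from the earlier to the later one.
    """
    rank = {label: i for i, label in enumerate(cluster_order)}
    edges = set()
    for u, v in permutations(cluster_of, 2):
        # Inside a cluster both directions stay open; across clusters only
        # the direction consistent with the cluster order is kept.
        if rank[cluster_of[u]] <= rank[cluster_of[v]]:
            edges.add((u, v))
    return edges


# Hypothetical toy input: five genes in three ordered clusters.
cluster_of = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "C"}
order = ["A", "B", "C"]

restricted = allowed_edges(cluster_of, order)
unrestricted = len(cluster_of) * (len(cluster_of) - 1)
print(len(restricted), "of", unrestricted, "directed edges remain candidates")
```

Because edges between clusters can only follow the cluster order, no cycle can pass through more than one cluster, so acyclicity only has to be checked within each cluster during the search.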

