An Augmented High-Dimensional Graphical Lasso Method to Incorporate Prior Biological Knowledge for Global Network Learning.

Authors

Zhuang Yonghua, Xing Fuyong, Ghosh Debashis, Banaei-Kashani Farnoush, Bowler Russell P, Kechris Katerina

Affiliations

Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States.

Department of Computer Science and Engineering, University of Colorado Denver, Denver, CO, United States.

Publication information

Front Genet. 2022 Jan 27;12:760299. doi: 10.3389/fgene.2021.760299. eCollection 2021.

Abstract

Biological networks are often inferred through Gaussian graphical models (GGMs) using gene or protein expression data only. GGMs identify conditional dependence by estimating a precision matrix between genes or proteins. However, conventional GGM approaches often ignore prior knowledge about protein-protein interactions (PPI). Recently, several groups have extended GGMs to weighted graphical Lasso (wGlasso) and network-based gene set analysis (Netgsa) and have demonstrated the advantages of incorporating PPI information. However, these methods are either computationally intractable for large-scale data or disregard weights in the PPI networks. To address these shortcomings, we extended the Netgsa approach and developed an augmented high-dimensional graphical Lasso (AhGlasso) method to incorporate edge weights in known PPI with omics data for global network learning. This new method outperforms weighted graphical Lasso-based algorithms with respect to computational time in simulated large-scale data settings while achieving better or comparable prediction accuracy of node connections. In total runtime, AhGlasso is approximately five times faster than weighted Glasso methods when the graph size ranges from 1,000 to 3,000 with a fixed sample size (n = 300), and the runtime difference between AhGlasso and weighted Glasso grows as the graph size increases. Using proteomic data from a study on chronic obstructive pulmonary disease, we demonstrate that AhGlasso improves protein network inference compared to the Netgsa approach by incorporating PPI information.
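
To illustrate the statement that GGMs identify conditional dependence through the precision matrix, the following is a minimal sketch, not the AhGlasso method itself. It assumes a hypothetical three-variable chain (variable 0 linked to 1, and 1 linked to 2) with a known covariance matrix, inverts that matrix exactly, and reads network edges off the nonzero partial correlations. No Lasso penalty or PPI weights are applied; the covariance values, function names, and the numerical threshold are illustrative assumptions.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the input with the identity matrix: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Choose the row with the largest absolute entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]


def partial_correlations(precision):
    """Convert a precision matrix into partial correlations:
    rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)."""
    n = len(precision)
    return [
        [
            1.0 if i == j
            else -precision[i][j] / sqrt(precision[i][i] * precision[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical covariance of a chain 0 - 1 - 2 (illustrative values only).
covariance = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]

precision = invert(covariance)
pcor = partial_correlations(precision)

# An edge is drawn wherever the partial correlation is numerically nonzero.
edges = [
    (i, j)
    for i in range(3)
    for j in range(i + 1, 3)
    if abs(pcor[i][j]) > 1e-9
]
print(edges)
```

In this toy case, variables 0 and 2 are marginally correlated (0.25) but have a zero entry in the precision matrix, so no edge is drawn between them. With real expression data the precision matrix has to be estimated, which is where penalized estimators such as the graphical Lasso come in.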

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5419/8829118/c520834c9fee/fgene-12-760299-g001.jpg
