A probabilistic graphical model for estimating selection coefficients of nonsynonymous variants from human population sequence data.

Author information

Zhao Yige, Lan Tian, Zhong Guojie, Hagen Jake, Pan Hongbing, Chung Wendy K, Shen Yufeng

Affiliations

Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.

The Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA.

Publication information

Nat Commun. 2025 May 20;16(1):4670. doi: 10.1038/s41467-025-59937-2.

Abstract

Accurately predicting the effect of missense variants is important for discovering disease risk genes and for clinical genetic diagnostics. Commonly used computational methods predict pathogenicity, which does not capture the quantitative impact on fitness in humans. We develop a method, MisFit, to estimate missense fitness effect using a graphical model. MisFit jointly models the effect at a molecular level and at a population level (the selection coefficient), assuming that in the same gene, missense variants with similar molecular effects have similar selection coefficients. We train it by maximizing the probability of the observed allele counts in 236,017 individuals of European ancestry. We show that the estimated selection coefficient is informative in predicting allele frequency across ancestries and consistent with the fraction of de novo mutations in sites under strong selection. Further, it outperforms previous methods in prioritizing de novo missense variants in individuals with neurodevelopmental disorders. In conclusion, MisFit accurately predicts the selection coefficient and yields new insights from genomic data.
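The idea of fitting a selection coefficient by maximizing the probability of observed allele counts can be illustrated with a toy example. The sketch below is not MisFit's model. It assumes a deterministic mutation-selection balance in which the expected allele frequency is roughly the mutation rate divided by the selection coefficient, Poisson-distributed allele counts, and a single selection coefficient shared by a handful of sites. The function names, the per-site mutation rate of 1e-8, and the example counts are all hypothetical; only the cohort size comes from the abstract.

```python
import math


def log_likelihood(s, counts, mutation_rates, n_individuals):
    """Poisson log-likelihood of allele counts given a shared selection coefficient s."""
    total = 0.0
    for k, mu in zip(counts, mutation_rates):
        # Assumed expected allele count: 2N chromosomes times frequency mu / s.
        lam = 2 * n_individuals * mu / s
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def estimate_s(counts, mutation_rates, n_individuals, grid_size=2000):
    """Grid-search maximum-likelihood estimate of s on a log scale in [1e-5, 1]."""
    best_s, best_ll = None, -math.inf
    for i in range(grid_size + 1):
        s = 10 ** (-5 + 5 * i / grid_size)
        ll = log_likelihood(s, counts, mutation_rates, n_individuals)
        if ll > best_ll:
            best_s, best_ll = s, ll
    return best_s


# Hypothetical data: allele counts at five sites with an assumed mutation rate.
counts = [3, 0, 1, 0, 2]
mutation_rates = [1e-8] * 5
n_individuals = 236_017  # cohort size reported in the abstract

s_hat = estimate_s(counts, mutation_rates, n_individuals)
print(f"estimated s = {s_hat:.4g}")
```

In this toy setting the maximum also has a closed form, the sum of 2N times the mutation rates divided by the total allele count, which gives a quick check on the grid search. A realistic model would also have to account for genetic drift and demographic history, which this sketch ignores.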

