Selecting variant masks to improve power and replicability of gene-level burden tests.

Author information

Nguyen Trang, Koesterer Ryan, Jurgens Sean J, Dornbos Peter, Yoshiji Satoshi, Llamas Alex, Jang Dongkeun, Smadbeck Patrick, Moriondo Annie, Hoang Quy, Ruebenacker Oliver, Ellinor Patrick, Burtt Noël, Flannick Jason

Affiliations

Program in Medical & Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA.

Publication information

Res Sq. 2025 Apr 15:rs.3.rs-6322956. doi: 10.21203/rs.3.rs-6322956/v1.

Abstract

Rare coding variant association studies typically perform gene-level association tests in which variants are filtered (or "masked") and aggregated based on functional annotation and allele frequency. As there is little research and no consensus regarding masking strategies to use, we investigated the impact of masking strategies on gene-level burden tests, the most widely used and interpretable type of aggregate association test. A systematic review of 234 studies catalogued 664 masks and masking strategies that rarely repeated across studies. Analyzing 54 traits within 189,947 UK Biobank exomes, we show that the number of significant associations greatly depends on the masking strategy employed (ranging from 58 to 2,523 associations) and, consequently, separate published analyses of this dataset report minimally overlapping associations (<30%). By empirically determining mask combinations that maximize the number of significant associations, we propose masking strategies that detect twice as many significant low-frequency and rare variant associations as the "average" strategies previously employed, with consistent performance across many traits. Our analyses demonstrate the inconsistency of previously used variant masking strategies and provide a simple solution to increase power and replicability in future studies.
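The filtering and aggregation step that the abstract describes can be illustrated with a minimal sketch. This is not the authors' pipeline: the annotation labels, allele-frequency thresholds, gene names, toy genotypes, and the names `Variant`, `Mask`, and `burden_scores` are all hypothetical and chosen only for illustration. The sketch shows how two different masks applied to the same data yield different per-gene burden scores, which is the kind of mask dependence the abstract reports at scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    annotation: str     # illustrative labels: "pLoF", "missense", "synonymous"
    allele_freq: float  # alternate allele frequency in the cohort


@dataclass(frozen=True)
class Mask:
    annotations: frozenset  # functional annotations the mask admits
    max_af: float           # allele-frequency ceiling (inclusive)

    def keeps(self, v: Variant) -> bool:
        return v.annotation in self.annotations and v.allele_freq <= self.max_af


def burden_scores(variants, genotypes, mask):
    """Sum alternate-allele counts over the variants a mask keeps, per gene.

    genotypes[s][i] is the alt-allele count of sample s at variants[i].
    Returns {gene: [score for each sample]}; genes with no kept variant are omitted.
    """
    scores = {}
    for i, v in enumerate(variants):
        if not mask.keeps(v):
            continue
        per_sample = scores.setdefault(v.gene, [0] * len(genotypes))
        for s, row in enumerate(genotypes):
            per_sample[s] += row[i]
    return scores


# Hypothetical toy data: four variants, three samples.
variants = [
    Variant("GENE_A", "pLoF", 0.0001),
    Variant("GENE_A", "missense", 0.02),
    Variant("GENE_A", "synonymous", 0.0005),
    Variant("GENE_B", "missense", 0.0008),
]
genotypes = [
    [1, 0, 1, 0],
    [0, 2, 0, 1],
    [0, 0, 0, 0],
]

# Two hypothetical masks: a strict one and a broader one.
strict = Mask(frozenset({"pLoF"}), max_af=0.001)
broad = Mask(frozenset({"pLoF", "missense"}), max_af=0.01)

strict_scores = burden_scores(variants, genotypes, strict)
broad_scores = burden_scores(variants, genotypes, broad)
```

In this toy example the strict mask produces a score only for `GENE_A`, while the broader mask also yields a score for `GENE_B`. In a real analysis the per-gene scores would then be tested for association with each trait, a step this sketch leaves out.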

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7613/12047983/06db97e96337/nihpp-rs6322956v1-f0001.jpg
