Mendelian randomization with fine-mapped genetic data: Choosing from large numbers of correlated instrumental variables.

Authors

Burgess Stephen, Zuber Verena, Valdes-Marquez Elsa, Sun Benjamin B, Hopewell Jemma C

Affiliations

MRC Biostatistics Unit, Cambridge, United Kingdom.

Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom.

Publication information

Genet Epidemiol. 2017 Dec;41(8):714-725. doi: 10.1002/gepi.22077. Epub 2017 Sep 25.

Abstract

Mendelian randomization uses genetic variants to make causal inferences about the effect of a risk factor on an outcome. With fine-mapped genetic data, there may be hundreds of genetic variants in a single gene region, any of which could be used to assess this causal relationship. However, using too many genetic variants in the analysis can lead to spurious estimates and inflated Type 1 error rates. But if only a few genetic variants are used, then the majority of the data is ignored and estimates are highly sensitive to the particular choice of variants. We propose an approach based on summarized data only (genetic association and correlation estimates) that uses principal components analysis to form instruments. This approach has desirable theoretical properties: it takes the totality of data into account and does not suffer from numerical instabilities. It also has good properties in simulation studies: it is not particularly sensitive to varying the genetic variants included in the analysis or the genetic correlation matrix, and it does not have greatly inflated Type 1 error rates. Overall, the method gives estimates that are less precise than those from variable selection approaches (such as using a conditional analysis or pruning approach to select variants), but are more robust to seemingly arbitrary choices in the variable selection step. Methods are illustrated by an example using genetic associations with testosterone for 320 genetic variants to assess the effect of sex-hormone-related pathways on coronary artery disease risk, in which variable selection approaches give inconsistent inferences.
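The abstract does not spell out the computation, so the following is only a minimal sketch of how principal components might be formed from summarized data and used as instruments. Several choices are assumptions, not details taken from the abstract: the weighting of the correlation matrix by `beta_x / se_y`, the 99% variance threshold, the inverse-variance weighted estimator applied to the transformed associations, and the function name `pca_ivw` and all toy numbers.

```python
import numpy as np

def pca_ivw(beta_x, beta_y, se_y, rho, var_explained=0.99):
    """Sketch: PCA-based instruments from summarized data (assumed details).

    beta_x : variant associations with the risk factor
    beta_y : variant associations with the outcome
    se_y   : standard errors of beta_y
    rho    : correlation (LD) matrix between the variants
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Assumed weighting: scale the correlation matrix by beta_x / se_y so that
    # components reflect variants informative about the risk factor.
    w = beta_x / se_y
    psi = np.outer(w, w) * rho

    # Eigendecomposition of the symmetric matrix, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(psi)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the smallest number of components reaching the variance threshold.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cum, var_explained) + 1)
    W = eigvecs[:, :k]

    # Project the associations and the outcome covariance onto the components.
    bx = W.T @ beta_x
    by = W.T @ beta_y
    omega = W.T @ (np.outer(se_y, se_y) * rho) @ W

    # Generalized inverse-variance weighted estimate on the transformed data.
    omega_inv_bx = np.linalg.solve(omega, bx)
    var = 1.0 / (bx @ omega_inv_bx)
    estimate = var * (omega_inv_bx @ by)
    return estimate, np.sqrt(var), k

# Toy data (hypothetical): six correlated variants, true causal effect 0.5.
n = 6
idx = np.arange(n)
rho = 0.9 ** np.abs(idx[:, None] - idx[None, :])
beta_x = np.array([0.10, 0.12, 0.09, 0.11, 0.08, 0.10])
beta_y = 0.5 * beta_x
se_y = np.full(n, 0.02)

estimate, std_error, n_components = pca_ivw(beta_x, beta_y, se_y, rho)
print(estimate, std_error, n_components)
```

Because the toy outcome associations are exactly 0.5 times the risk factor associations, the estimate equals 0.5 up to rounding whichever components are retained. Working with a handful of components rather than all correlated variants avoids inverting a near-singular correlation matrix, which is one way to read the abstract's remark about numerical instabilities.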

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f047/5725678/a20997b3e18e/GEPI-41-714-g001.jpg
