Shanxi Medical University.
Hebei Medical University.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa113.
Mediation analysis is a useful tool for investigating the effect of mediators that lie on the path from an independent variable to an outcome. As the dimensionality of mediators grows, for example in (epi)genomic studies, high-dimensional mediation models are needed. In this work, we focus on epigenetic studies with the goal of identifying important DNA methylation sites that act as mediators between an exposure and a disease outcome. Specifically, we focus on gene-based high-dimensional mediation analysis implemented with kernel principal component analysis to capture potentially nonlinear mediation effects. We first review current high-dimensional mediation models and then propose two gene-based analytical approaches: gene-based high-dimensional mediation analysis based on a linearity assumption between mediators and outcome (gHMA-L) and gene-based high-dimensional mediation analysis based on a nonlinearity assumption (gHMA-NL). Since the true underlying mediation relationship is unknown in practice, we further propose an omnibus test for gene-based high-dimensional mediation analysis (gHMA-O) that combines gHMA-L and gHMA-NL. Extensive simulation studies show that gHMA-L performs better when the linear model assumption holds and gHMA-NL performs better when the nonlinear model assumption holds, while gHMA-O, by combining the two, is more powerful and robust across both settings. We apply the proposed methods to two datasets to investigate genes whose methylation levels act as important mediators in the relationship (1) between alcohol consumption and epithelial ovarian cancer risk, using data from the Mayo Clinic Ovarian Cancer Case-Control Study, and (2) between childhood maltreatment and comorbid post-traumatic stress disorder and depression in adulthood, using data from the Gray Trauma Project.
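As a rough illustration of the general idea of summarizing a gene's methylation sites with kernel principal components before a mediation fit, a minimal NumPy sketch follows. It is not the authors' gHMA-L, gHMA-NL, or gHMA-O procedure: the Gaussian kernel, its bandwidth, the number of components, the continuous outcome, the simulated data, and the function names `rbf_kernel_pca` and `product_of_coefficients` are all assumptions made for illustration, and no significance test is included.

```python
import numpy as np


def rbf_kernel_pca(M, n_components=2, gamma=None):
    """Kernel PCA scores for the rows of M using a Gaussian (RBF) kernel.

    M is an (n_samples, n_sites) matrix of methylation values for one gene.
    The kernel choice and the default bandwidth are illustrative assumptions.
    """
    n, p = M.shape
    if gamma is None:
        gamma = 1.0 / p  # assumed default bandwidth
    sq = np.sum(M ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (M @ M.T), 0.0)
    K = np.exp(-gamma * d2)
    # Double-center the kernel matrix (centering in feature space).
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    # Sample scores: eigenvectors scaled by the square root of eigenvalues.
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def product_of_coefficients(x, pcs, y):
    """Least-squares estimates for a simple linear mediation fit.

    Mediator models: pc_k = a0 + alpha_k * x + error
    Outcome model:   y = b0 + direct * x + sum_k beta_k * pc_k + error
    Returns alpha, beta, and the summed product alpha . beta.
    """
    n = len(x)
    X1 = np.column_stack([np.ones(n), x])
    alpha = np.linalg.lstsq(X1, pcs, rcond=None)[0][1]
    X2 = np.column_stack([np.ones(n), x, pcs])
    beta = np.linalg.lstsq(X2, y, rcond=None)[0][2:]
    return alpha, beta, float(alpha @ beta)


# Simulated toy data (hypothetical; not from either study in the abstract).
rng = np.random.default_rng(0)
n, p = 200, 10
x = rng.normal(size=n)                          # exposure
M = 0.5 * x[:, None] + rng.normal(size=(n, p))  # methylation sites in a gene
y = 0.3 * x + np.sin(M[:, 0]) + rng.normal(scale=0.5, size=n)  # outcome

pcs = rbf_kernel_pca(M, n_components=2)
alpha, beta, indirect = product_of_coefficients(x, pcs, y)
print("alpha:", alpha, "beta:", beta, "summed product:", indirect)
```

The kernel components stand in for the gene's methylation sites as a low-dimensional set of mediators, which is what lets a nonlinear dependence on the original sites show up in an otherwise linear outcome model. A real analysis would need a model suited to a binary disease outcome and a formal test of the mediation effect.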