Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
Nat Genet. 2018 Apr;50(4):621-629. doi: 10.1038/s41588-018-0081-4. Epub 2018 Apr 9.
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.
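The gene-selection step described in the abstract (genes with the highest specific expression in a tissue, plus the regions surrounding them) can be illustrated with a minimal sketch. The abstract does not specify the specificity statistic, the fraction of genes kept, or the size of the surrounding region, so the Welch t-statistic, the `fraction` default, and the `pad` default below are illustrative assumptions, and all function and gene names are hypothetical. The regression itself is not shown; the windows produced here would only serve as the annotation passed to stratified LD score regression.

```python
from math import sqrt
from statistics import mean, variance


def specificity_t(values, in_focal):
    """Welch t-statistic: focal-tissue samples versus all other samples.

    Assumption: a t-statistic is one plausible specificity score; the
    abstract does not name the statistic. Needs >= 2 samples per group.
    """
    focal = [v for v, f in zip(values, in_focal) if f]
    other = [v for v, f in zip(values, in_focal) if not f]
    se = sqrt(variance(focal) / len(focal) + variance(other) / len(other))
    return (mean(focal) - mean(other)) / se if se > 0 else 0.0


def top_specific_genes(expr, in_focal, fraction=0.10):
    """Return the top `fraction` of genes ranked by specificity (assumed cutoff)."""
    ranked = sorted(expr, key=lambda g: specificity_t(expr[g], in_focal), reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    return ranked[:n_top]


def padded_windows(coords, genes, pad=100_000):
    """Extend each gene's (chrom, start, end) by `pad` bases on both sides (assumed size)."""
    return [(coords[g][0], max(0, coords[g][1] - pad), coords[g][2] + pad) for g in genes]


# Toy data: three hypothetical genes, six samples, the first three from the focal tissue.
expr = {
    "GENE_A": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],
    "GENE_B": [5.0, 6.0, 5.5, 5.0, 6.0, 5.5],
    "GENE_C": [1.0, 2.0, 1.5, 8.0, 9.0, 10.0],
}
in_focal = [True, True, True, False, False, False]
coords = {"GENE_A": ("chr1", 150_000, 180_000)}

top = top_specific_genes(expr, in_focal, fraction=1 / 3)
windows = padded_windows(coords, top)
```

In this toy input only `GENE_A` is clearly higher in the focal samples, so it is the single gene selected and its window becomes the annotation for that tissue. A real analysis would repeat this per tissue or cell type and then test each annotation for heritability enrichment.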