Rosen Jonathan D, Broadaway K Alaine, Brotman Sarah M, Mohlke Karen L, Love Michael I
Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA.
These authors contributed equally.
bioRxiv. 2025 Aug 5:2025.08.05.668745. doi: 10.1101/2025.08.05.668745.
Expression quantitative trait locus (eQTL) studies in human cohorts typically detect at least one regulatory signal per gene, and have been proposed as a way to explain mechanisms of genetic liability for other traits, as discovered in genome-wide association studies (GWAS). In particular, eQTL signals may colocalize with GWAS signals, suggesting gene expression as a possible mediator. However, recent studies have noted that colocalization occurs infrequently, even when expression is measured in biologically relevant tissues. Most eQTL studies to date include only hundreds of individuals and are underpowered to discover distal regulatory signals that explain smaller fractions of gene expression variance. We integrate evidence from recent eQTL studies and demonstrate that limited statistical power due to sample size skews which eQTL signals are detected across signal strengths. We estimate that a sample size of 500 detects from <0.1% to 60% of eQTL, depending on signal strength, and that a sample size of 2,000 would detect 36.8% of all eQTL. We show that eQTL signals that can only be discovered in larger studies exhibit characteristics more similar to those of GWAS signals, including greater distance to the regulated gene and higher probability of loss intolerance. Finally, using results from recent eQTL studies and meta-analyses, we observe a large increase in detected colocalizations with GWAS signals compared to previous studies. These findings caution against overinterpreting the absence of colocalization in underpowered studies and provide guidance for designing future eQTL experiments to improve power and to complement perturbation-based approaches in characterizing gene-trait mechanisms.
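The dependence of detection on sample size and signal strength can be illustrated with a textbook power calculation. The sketch below is not the authors' estimation procedure, which integrates evidence from published eQTL studies. It assumes a single-variant linear regression whose 1-df test statistic has non-centrality parameter N * r2 / (1 - r2), where r2 is the fraction of expression variance explained, and it uses a normal approximation. The function name `eqtl_power`, the significance threshold `alpha = 5e-8`, and the r2 values are hypothetical choices for illustration only.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def eqtl_power(n_samples: int, var_explained: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided single-variant association test.

    Assumptions (illustrative, not taken from the abstract):
    - the test statistic is approximately normal with mean sqrt(ncp),
      where ncp = n_samples * r2 / (1 - r2);
    - alpha is a hypothetical per-test significance threshold.
    """
    if not 0.0 <= var_explained < 1.0:
        raise ValueError("var_explained must lie in [0, 1)")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")

    # Non-centrality parameter of the 1-df association test.
    ncp = n_samples * var_explained / (1.0 - var_explained)
    shift = sqrt(ncp)

    # Two-sided critical value for the chosen threshold.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)

    # Probability that the statistic falls beyond either critical value.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical grid of signal strengths and sample sizes.
    for r2 in (0.005, 0.01, 0.05):
        for n in (500, 2000):
            print(f"r2={r2:<6} N={n:<5} power={eqtl_power(n, r2):.4f}")
```

Under these assumptions, weak signals are almost never detected at a sample size of 500 and become detectable only as N grows, which is the qualitative pattern the abstract describes. The specific percentages reported in the abstract come from the authors' analysis and are not reproduced by this sketch.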