Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA.
Genome Biol. 2024 Jul 8;25(1):184. doi: 10.1186/s13059-024-03328-1.
Although disease-causal genetic variants have been found within silencer sequences, we still lack a comprehensive analysis of the association of silencers with diseases. Here, we profiled GWAS variants in 2.8 million candidate silencers across 97 human samples derived from a diverse panel of tissues and developmental time points, using deep learning models.
We show that candidate silencers exhibit strong enrichment in disease-associated variants, and several diseases display a much stronger association with silencer variants than with enhancer variants. Close to 52% of candidate silencers cluster, forming silencer-rich loci. In the loci of the Parkinson's-disease-hallmark genes TRIM31 and MAL, the associated SNPs densely populate clustered candidate silencers rather than enhancers, displaying an overall twofold enrichment in silencers versus enhancers. The disruption of apoptosis in neuronal cells is associated with both schizophrenia and bipolar disorder and can largely be attributed to variants within candidate silencers. Our model permits a mechanistic explanation of causative SNP effects by identifying altered binding of tissue-specific repressors and activators, validated with 70% directional concordance using SNP-SELEX. Narrowing the focus of the analysis to individual silencer variants, experimental data confirm the role of the rs62055708 SNP in Parkinson's disease, rs2535629 in schizophrenia, and rs6207121 in type 1 diabetes.
In summary, our results indicate that advances in deep learning models for the discovery of disease-causal variants within candidate silencers effectively "double" the number of functionally characterized GWAS variants. This provides a basis for explaining mechanisms of action and designing novel diagnostics and therapeutics.
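The directional concordance reported for the SNP-SELEX validation can be read as the fraction of SNPs for which the predicted change in binding has the same sign as the measured change. The sketch below illustrates that reading only; it is not the authors' pipeline. The function name `directional_concordance`, the choice to skip zero-valued effects, and the toy numbers are all assumptions made for illustration and are not data from the study.

```python
from typing import Sequence


def directional_concordance(predicted: Sequence[float],
                            measured: Sequence[float]) -> float:
    """Return the fraction of SNPs whose predicted and measured effects share a sign.

    Hypothetical helper: SNPs with a zero effect on either side are skipped,
    which is an assumption, not a rule taken from the paper.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")

    # Keep only SNPs with a defined direction in both vectors.
    pairs = [(p, m) for p, m in zip(predicted, measured) if p != 0 and m != 0]
    if not pairs:
        raise ValueError("no SNPs with a non-zero effect in both vectors")

    # A pair is concordant when both effects point the same way.
    agreeing = sum(1 for p, m in pairs if (p > 0) == (m > 0))
    return agreeing / len(pairs)


# Made-up effect sizes for ten SNPs (positive = stronger binding for the alternate allele).
predicted = [0.8, -0.3, 0.5, -0.9, 0.2, -0.4, 0.7, -0.1, 0.6, -0.5]
measured = [1.2, -0.7, -0.4, -1.1, 0.3, 0.9, 0.5, -0.2, -0.8, -0.6]

print(f"directional concordance: {directional_concordance(predicted, measured):.2f}")
```

In this toy input, seven of the ten SNPs agree in direction, so the metric depends only on the sign of each effect and not on its magnitude.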