Improving polygenic prediction with genetically inferred ancestry.

Authors

Naret Olivier, Kutalik Zoltan, Hodel Flavia, Xu Zhi Ming, Marques-Vidal Pedro, Fellay Jacques

Affiliations

School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Publication information

HGG Adv. 2022 Apr 20;3(3):100109. doi: 10.1016/j.xhgg.2022.100109. eCollection 2022 Jul 14.

Abstract

Genome-wide association studies (GWASs) have demonstrated that most common diseases have a strong genetic component arising from many genetic variants, each with a small effect size. GWAS summary statistics have allowed the construction of polygenic scores (PGSs) that estimate part of the individual risk for common diseases. Here, we propose to improve PGS-based risk estimation by incorporating genetic ancestry derived from genome-wide genotyping data. Our method involves three cohorts: a base (or discovery) cohort for association studies, a target cohort for phenotype/risk prediction, and a map cohort for ancestry mapping. It proceeds in three steps: (1) it generates, for each individual in the base and target cohorts, a set of principal components based on the map cohort, called mapped PCs; (2) it associates the phenotype with the mapped PCs in the base cohort; and (3) it uses the mapped PCs in the target cohort to generate a phenotypic predictor called the ancestry score. We evaluated the ancestry score by comparing a predictive model using a PGS alone with one combining a PGS and an ancestry score. First, we performed simulations and found that the ancestry score has a greater impact on traits that correlate with ancestry-specific variants. Second, we showed, using UK Biobank data, that the ancestry score improves genetic prediction for our nine phenotypes, although to very different degrees. Third, we performed simulations and found that the more heterogeneous the base and target cohorts, the more beneficial the ancestry score is. Finally, we validated our approach under realistic conditions with UK Biobank as the base cohort and Swiss individuals from the CoLaus|PsyCoLaus study as the target cohort.
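
The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain SVD-based PCA for the mapping step and ordinary least squares for the association step, runs on simulated genotypes from two hypothetical populations, and all function names (`fit_map_pcs`, `project`, `fit_ancestry_model`, `ancestry_score`) are invented for illustration.

```python
import numpy as np

def fit_map_pcs(map_genotypes, n_pcs):
    """Learn PC axes from the map cohort (rows: individuals, columns: variants)."""
    mean = map_genotypes.mean(axis=0)
    # Rows of vt are the principal axes in variant space.
    _, _, vt = np.linalg.svd(map_genotypes - mean, full_matrices=False)
    return mean, vt[:n_pcs].T

def project(genotypes, mean, loadings):
    """Step 1: compute mapped PCs by projecting a cohort onto the map-cohort axes."""
    return (genotypes - mean) @ loadings

def fit_ancestry_model(base_pcs, base_phenotype):
    """Step 2: associate the phenotype with mapped PCs in the base cohort (OLS, assumed)."""
    design = np.column_stack([np.ones(len(base_pcs)), base_pcs])
    coef, *_ = np.linalg.lstsq(design, base_phenotype, rcond=None)
    return coef

def ancestry_score(target_pcs, coef):
    """Step 3: predict the phenotype in the target cohort from its mapped PCs."""
    return coef[0] + target_pcs @ coef[1:]

# --- Simulated data (purely illustrative, not from the study) ---
rng = np.random.default_rng(0)
n_variants = 200
freq_a = rng.uniform(0.1, 0.9, n_variants)
freq_b = np.clip(freq_a + rng.normal(0, 0.15, n_variants), 0.05, 0.95)

def simulate(n):
    """Draw genotypes from two populations; the phenotype mean differs by population."""
    pop = rng.integers(0, 2, n)
    freqs = np.where(pop[:, None] == 0, freq_a, freq_b)
    genotypes = rng.binomial(2, freqs).astype(float)
    phenotype = 1.0 * pop + rng.normal(0, 1, n)
    return genotypes, phenotype

map_geno, _ = simulate(300)
base_geno, base_pheno = simulate(1000)
target_geno, target_pheno = simulate(500)

mean, loadings = fit_map_pcs(map_geno, n_pcs=2)
base_pcs = project(base_geno, mean, loadings)
target_pcs = project(target_geno, mean, loadings)

coef = fit_ancestry_model(base_pcs, base_pheno)
score = ancestry_score(target_pcs, coef)
correlation = np.corrcoef(score, target_pheno)[0, 1]
```

In this toy setting the phenotype depends only on population membership, so `correlation` reflects how well the mapped PCs recover ancestry. To mirror the evaluation described in the abstract, `score` would be added as a covariate alongside a PGS and the combined model compared with the PGS-only model.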

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/776d/9095896/58db5e998ef0/gr1.jpg