Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA.
Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305, USA.
Hum Genomics. 2024 Sep 2;18(1):93. doi: 10.1186/s40246-024-00664-y.
Polygenic risk scores (PRS) derived from European individuals have reduced portability across global populations, limiting their clinical implementation at a worldwide scale. Here, we investigate the performance of a wide range of PRS models across four ancestry groups (Africans, Europeans, East Asians, and South Asians) for 14 conditions of high medical interest.
To select the best-performing model per trait, we first compared the performance of publicly available scores and constructed new models using different methods (LDpred2, PRS-CSx and SNPnet). We used 285 K European individuals from the UK Biobank (UKBB) for training and 18 K individuals, including diverse ancestries, for testing. We then evaluated the portability of the best models in Europeans and compared their accuracies with those of the best PRS per ancestry. Finally, we validated the selected PRS models using an independent set of 8,417 individuals from Biobank of the Americas-Genomelink (BbofA-GL), and performed a PRS-PheWAS.
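As a minimal sketch of the per-trait model selection described above, the snippet below ranks candidate scores by the phenotypic variance they explain in a test set. It assumes a quantitative trait and uses the squared Pearson correlation as the accuracy metric; the function names `variance_explained` and `best_model`, the model labels, and the toy values are hypothetical and are not taken from the study, which may have used other metrics and covariate adjustments.

```python
from statistics import mean


def variance_explained(prs, phenotype):
    """Squared Pearson correlation between a score and a quantitative phenotype."""
    mean_prs, mean_pheno = mean(prs), mean(phenotype)
    sxy = sum((x - mean_prs) * (y - mean_pheno) for x, y in zip(prs, phenotype))
    sxx = sum((x - mean_prs) ** 2 for x in prs)
    syy = sum((y - mean_pheno) ** 2 for y in phenotype)
    return (sxy ** 2) / (sxx * syy)


def best_model(scores_by_model, phenotype):
    """Return the name of the candidate score with the highest variance explained."""
    return max(
        scores_by_model,
        key=lambda name: variance_explained(scores_by_model[name], phenotype),
    )


# Toy illustration only: hypothetical scores for five test individuals.
phenotype = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "model_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "model_b": [2.0, 1.0, 4.0, 3.0, 5.0],
}
selected = best_model(candidates, phenotype)
```

Running the same selection separately within each ancestry group would correspond to choosing the best PRS per ancestry rather than reusing the best model for Europeans.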
We confirmed a decay in PRS performance relative to Europeans when the evaluation was conducted using the best PRS model for Europeans (51.3% for South Asians, 46.6% for East Asians and 39.4% for Africans). PRS performance improved when ancestry-specific PRS models were selected (phenotype variance increase: 1.62 for Africans, 1.40 for South Asians and 0.96 for East Asians). Additionally, when we selected the optimal model conditional on ancestry for CAD, HDL-C and LDL-C, hypertension, hypothyroidism and T2D, PRS performance in the studied populations was more comparable to that observed in Europeans. Finally, we independently validated the tested models for Europeans and conducted a PRS-PheWAS, identifying cross-trait interplay between cardiometabolic conditions, and between immune-mediated components.
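One common way to express portability loss is the percentage reduction in accuracy relative to the reference population. The sketch below illustrates that arithmetic under the assumption that accuracy is a variance-explained value per ancestry; whether the percentages reported above were computed exactly this way is not stated here, and the function name `relative_decay` and the input values are hypothetical.

```python
def relative_decay(r2_reference, r2_target):
    """Percentage drop in accuracy in a target group relative to a reference group."""
    if r2_reference <= 0:
        raise ValueError("reference accuracy must be positive")
    return 100.0 * (1.0 - r2_target / r2_reference)


# Hypothetical values: a score explaining 10% of variance in the reference
# group and 5% in the target group shows a 50% relative decay.
example_decay = relative_decay(0.10, 0.05)
```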
Our work comprehensively evaluated PRS accuracy across a wide range of phenotypes, reducing the uncertainty about which PRS model to choose for which ancestry group. This evaluation allowed us to identify specific conditions where implementing risk-prioritization strategies could have practical utility across diverse ancestral groups, contributing to democratizing the implementation of PRS.