Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Mental Health Services Copenhagen, Roskilde, Denmark.
The Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH), Copenhagen, Denmark.
Nat Commun. 2021 Sep 6;12(1):5276. doi: 10.1038/s41467-021-25014-7.
A promise of genomics in precision medicine is to provide individualized genetic risk predictions. Polygenic risk scores (PRS), computed by aggregating effects from many genomic variants, have been developed as a useful tool in complex disease research. However, applying PRS to predict an individual's disease susceptibility in a clinical setting is challenging, because PRS typically provide a relative measure of risk evaluated at the level of a group of people rather than at the individual level. Here, we introduce a machine-learning technique, Mondrian Cross-Conformal Prediction (MCCP), to estimate the confidence bounds of PRS-to-disease-risk prediction. MCCP can report a conditional probability value of disease status for each individual and give a prediction at a desired error level. Moreover, given a user-defined prediction error rate, MCCP can estimate the proportion of the sample (coverage) with a correct prediction.
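To make the idea of label-conditional p-values and a user-chosen error level concrete, the following is a minimal sketch of a Mondrian cross-conformal predictor for a binary disease label, written with NumPy only. It is not the authors' implementation. The underlying model (class-mean PRS with a midpoint-based nonconformity score), the fold count, and all function names (`fit_class_means`, `nonconformity`, `mccp_p_values`, `prediction_sets`) are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def fit_class_means(prs, y):
    """Toy underlying model (an assumption): mean PRS of controls (0) and cases (1)."""
    return np.array([prs[y == 0].mean(), prs[y == 1].mean()])

def nonconformity(prs, label, means):
    """Higher score = the PRS looks stranger for the hypothesised label."""
    midpoint = means.mean()
    direction = np.sign(means[1] - means[0])
    signed = direction * (prs - midpoint)  # positive = looks like a case
    return -signed if label == 1 else signed

def mccp_p_values(prs_train, y_train, prs_test, n_folds=5, seed=0):
    """Return an (n_test, 2) array of label-conditional p-values.

    Cross-conformal: each fold serves once as the calibration set while the
    model is fitted on the remaining folds. Mondrian: a test score is compared
    only with calibration scores of individuals carrying the same label.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(prs_train)), n_folds)
    counts = np.zeros((len(prs_test), 2))
    totals = np.zeros(2)
    for fold in folds:
        fit_mask = np.ones(len(prs_train), dtype=bool)
        fit_mask[fold] = False
        means = fit_class_means(prs_train[fit_mask], y_train[fit_mask])
        for label in (0, 1):
            cal = prs_train[fold][y_train[fold] == label]
            cal_scores = nonconformity(cal, label, means)
            test_scores = nonconformity(prs_test, label, means)
            # Count calibration scores at least as strange as each test score.
            counts[:, label] += (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
            totals[label] += len(cal)
    return (counts + 1) / (totals + 1)

def prediction_sets(p_values, error_level):
    """Boolean (n_test, 2) array: a label is kept when its p-value exceeds the error level."""
    return p_values > error_level
```

With p-values in hand, choosing an error level such as 0.05 yields a set of plausible labels for each individual. One way to summarize the result, again as an assumption about how coverage might be tallied, is the fraction of individuals whose set contains exactly one label, `(prediction_sets(p, 0.05).sum(axis=1) == 1).mean()`.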