Performance of Statistical and Machine Learning Risk Prediction Models for Surveillance Benefits and Failures in Breast Cancer Survivors.

Affiliations

Kaiser Permanente Washington Health Research Institute, Kaiser Permanente WA, Seattle, Washington.

Department of Radiology, University of Washington and Seattle Cancer Care Alliance, Seattle, Washington.

Publication information

Cancer Epidemiol Biomarkers Prev. 2023 Apr 3;32(4):561-571. doi: 10.1158/1055-9965.EPI-22-0677.

DOI:10.1158/1055-9965.EPI-22-0677

PMID:36697364

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10073265/

Abstract

BACKGROUND

Machine learning (ML) approaches facilitate risk prediction model development using high-dimensional predictors and higher-order interactions at the cost of model interpretability and transparency. We compared the relative predictive performance of statistical and ML models to guide modeling strategy selection for surveillance mammography outcomes in women with a personal history of breast cancer (PHBC).

METHODS

We cross-validated seven risk prediction models for two surveillance outcomes: failure (breast cancer within 12 months of a negative surveillance mammogram) and benefit (surveillance-detected breast cancer). We included 9,447 mammograms (495 failures, 1,414 benefits, and 7,538 nonevents) from 1996 to 2017, using 1:4 matched case-control samples of women with PHBC in the Breast Cancer Surveillance Consortium. We assessed the performance of conventional regression, regularized regressions (LASSO and elastic-net), and ML methods (random forests and gradient boosting machines) by evaluating their calibration and, among well-calibrated models, comparing the area under the receiver operating characteristic curve (AUC) and 95% confidence intervals (CI).
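The evaluation workflow in this paragraph can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes synthetic data, a hand-written L1-penalized (LASSO) logistic regression fitted by proximal gradient descent, 5-fold cross-validation, and the expected-to-observed event ratio as a simple calibration summary. The function names, the penalty `lam`, the learning rate, and the fold count are all illustrative choices that do not come from the paper.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_lasso_logistic(X, y, lam=0.01, lr=0.5, n_iter=200):
    """L1-penalized logistic regression via proximal gradient descent.
    The intercept b is left unpenalized (an assumption of this sketch)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def cross_val_predict(X, y, k=5, seed=0, **fit_kwargs):
    """Return out-of-fold predicted risks from k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    preds = [None] * len(X)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        w, b = fit_lasso_logistic([X[i] for i in train],
                                  [y[i] for i in train], **fit_kwargs)
        for i, risk in zip(test, predict([X[i] for i in test], w, b)):
            preds[i] = risk
    return preds

def expected_observed_ratio(y, preds):
    # Calibration-in-the-large: a ratio near 1 means the predicted risks
    # match the observed event rate on average.
    return sum(preds) / sum(y)

# Synthetic data (hypothetical): 5 standardized predictors, 2 informative.
rng = random.Random(42)
true_w = [1.0, -0.8, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(400)]
y = [1 if rng.random() < sigmoid(-1.5 + sum(wj * xj for wj, xj in zip(true_w, row)))
     else 0 for row in X]

preds = cross_val_predict(X, y, k=5, lam=0.01)
eo = expected_observed_ratio(y, preds)
print(f"expected/observed ratio: {eo:.2f}")
```

Only out-of-fold predictions enter the calibration summary, so each risk is scored by a model that never saw that observation. An elastic-net variant would add an L2 term to the gradient before the soft-thresholding step.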

RESULTS

LASSO and elastic-net consistently provided well-calibrated predicted risks for surveillance failure and benefit. The AUCs of LASSO and elastic-net were both 0.63 (95% CI, 0.60-0.66) for surveillance failure and 0.66 (95% CI, 0.64-0.68) for surveillance benefit, the highest among well-calibrated models.
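An AUC reported with a 95% CI, as in the results above, can be computed along the following lines. This is a hedged sketch on synthetic scores: the abstract does not say how the intervals were derived, so the percentile bootstrap used here is an assumption, and it ignores the matched case-control sampling of the actual study. The names `auc` and `bootstrap_auc_ci` are hypothetical.

```python
import random

def auc(y, scores):
    """AUC via the Mann-Whitney U statistic, using midranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    rank_sum = sum(r for r, yi in zip(ranks, y) if yi == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc_ci(y, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:  # a resample needs both events and nonevents
            stats.append(auc(yb, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic example (hypothetical): about 20% events, weakly separated scores.
rng = random.Random(1)
y = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
scores = [rng.gauss(0.5 if yi else 0.0, 1.0) for yi in y]

est = auc(y, scores)
lo, hi = bootstrap_auc_ci(y, scores)
print(f"AUC {est:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

The rank-based formula equals the probability that a randomly chosen event receives a higher score than a randomly chosen nonevent, which is why an AUC of 0.63 or 0.66 indicates modest discrimination.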

CONCLUSIONS

For predicting breast cancer surveillance mammography outcomes, regularized regression outperformed other modeling approaches and balanced the trade-off between model flexibility and interpretability.

IMPACT

Regularized regression may be preferred for developing risk prediction models in other contexts with rare outcomes, similar training sample sizes, and low-dimensional features.
