Machine Learning Models for Pancreatic Cancer Survival Prediction: A Multi-Model Analysis Across Stages and Treatments Using the Surveillance, Epidemiology, and End Results (SEER) Database.

Authors

Chakraborty Aditya, Pant Mohan D

Affiliation

Department of Epidemiology, Biostatistics, and Environmental Health, Joint School of Public Health, Old Dominion University, Norfolk, VA 23529, USA.

Publication

J Clin Med. 2025 Jul 2;14(13):4686. doi: 10.3390/jcm14134686.

Abstract

Pancreatic cancer is among the most lethal malignancies, with a poor prognosis and limited survival despite treatment advances. Accurate survival modeling is critical for prognostication and clinical decision-making. This study had three primary aims: (1) to determine the best-fitting survival distribution among patients who were diagnosed with and died from pancreatic cancer, across stages and treatment types; (2) to construct and compare predictive risk classification models; and (3) to evaluate survival probabilities using parametric, semi-parametric, non-parametric, machine learning, and deep learning methods for Stage IV patients receiving both chemotherapy and radiation. Using data from the SEER database, parametric models (Generalized Extreme Value, Generalized Pareto, Log-Pearson 3), semi-parametric (Cox), and non-parametric (Kaplan-Meier) methods were compared with four machine learning models (gradient boosting, neural network, elastic net, and random forest). Survival probability heatmaps were constructed, and six classification models were developed for risk stratification. ROC curves, accuracy, and goodness-of-fit tests were used for model validation. Statistical tests included Kruskal-Wallis, pairwise Wilcoxon, and chi-square. The Generalized Extreme Value (GEV) distribution was the best-fitting distribution in most scenarios. Stage-specific survival differences were statistically significant. The highest predictive accuracy (AUC: 0.947; accuracy: 56.8%) was observed in patients receiving both chemotherapy and radiation. The gradient boosting model predicted the most optimistic survival, while the random forest model showed a sharp decline in predicted survival after 15 months. This study emphasizes the importance of selecting appropriate analytical models for survival prediction and risk classification. Adopting advanced machine learning and deep learning models for these tasks can enhance patient outcomes and advance precision medicine initiatives.
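
The non-parametric Kaplan-Meier method named in the abstract can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' code: the function name `kaplan_meier`, the variable names, and the follow-up times are hypothetical, and the toy data are made up rather than taken from SEER. The sketch assumes each patient has a follow-up time in months and an indicator of whether death was observed (1) or the record was censored (0).

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of S(t) at each distinct time with an observed death.

    durations: follow-up times (any sortable numeric unit, here months)
    events:    1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) pairs in time order.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at their own time

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects at t are both removed before the next time.
        at_risk -= exits[t]
    return curve


# Made-up illustration data (hypothetical, not from the study).
months = [2, 3, 3, 5, 8, 8, 12, 15]
observed = [1, 1, 0, 1, 1, 1, 0, 1]

curve = kaplan_meier(months, observed)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
```

Censored subjects count toward the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimator from a simple empirical survival fraction. When every patient has an observed death, as in a cohort restricted to deceased patients, the two coincide.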

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7dbf/12250171/6c030d4be69e/jcm-14-04686-g001.jpg