Integration of Nuclear, Clinical, and Genetic Features for Lung Cancer Subtype Classification and Survival Prediction Based on Machine- and Deep-Learning Models.

Author information

Xie Bin, Mo Mingda, Cui Haidong, Dong Yijie, Yin Hongping, Lu Zhe

Affiliations

School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China.

Department of Breast Surgery, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 311121, China.

Publication information

Diagnostics (Basel). 2025 Mar 28;15(7):872. doi: 10.3390/diagnostics15070872.

DOI:10.3390/diagnostics15070872

PMID:40218222

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11988547/

Abstract

Lung cancer is one of the most prevalent cancers worldwide. Accurately determining lung cancer subtypes and identifying high-risk patients support individualized treatment and follow-up. This study aimed to establish an effective model for subtype classification and overall survival (OS) prediction in patients with lung cancer. Histopathological images, clinical data, and genetic information for lung adenocarcinoma and lung squamous cell carcinoma cases were downloaded from The Cancer Genome Atlas. A system of influencing factors was optimized based on nuclear, clinical, and genetic features. Four machine-learning models (light gradient boosting machine [LightGBM], extreme gradient boosting [XGBoost], random forest [RF], and adaptive boosting [AdaBoost]) and three deep-learning models (multilayer perceptron [MLP], TabNet, and convolutional neural network [CNN]) were employed for subtype classification and OS prediction, and their performance was comprehensively evaluated. XGBoost achieved the highest area under the curve (AUC), 0.9821, for subtype classification, whereas RF achieved the highest AUCs of 0.9134, 0.8706, and 0.8765 for predicting OS at 1, 2, and 3 years, respectively. Our study was the first to incorporate nuclear characteristics and patients' genetic information to predict the subtypes and OS of patients with lung cancer. Combining these different factors with artificial intelligence methods yielded AUC values slightly higher than those reported in previous studies.
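The AUC values reported above can be read as the probability that a model ranks a randomly chosen positive case above a randomly chosen negative one. The following is a minimal sketch of that computation for OS at fixed horizons, not the authors' pipeline. The patient tuples and risk scores are hypothetical, and the way censored patients are handled (excluding anyone lost to follow-up before the horizon) is an assumption, since the abstract does not describe it.

```python
from typing import List, Optional, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def label_at_horizon(time_years: float, died: bool, horizon: float) -> Optional[int]:
    """1 if death was observed by the horizon, 0 if the patient was followed
    past the horizon, None if censored before it (assumed handling)."""
    if died and time_years <= horizon:
        return 1
    if time_years > horizon:
        return 0
    return None


def auc_at_horizon(patients: List[Tuple[float, bool, float]], horizon: float) -> float:
    """AUC of the risk score for death within `horizon` years."""
    labels: List[int] = []
    scores: List[float] = []
    for time_years, died, risk in patients:
        y = label_at_horizon(time_years, died, horizon)
        if y is not None:  # skip patients censored before the horizon
            labels.append(y)
            scores.append(risk)
    return roc_auc(labels, scores)


# Hypothetical data: (follow-up time in years, death observed?, model risk score)
PATIENTS = [
    (0.5, True, 0.9),
    (0.8, False, 0.4),
    (1.5, True, 0.7),
    (2.5, True, 0.3),
    (3.5, False, 0.2),
    (4.0, True, 0.8),
]

if __name__ == "__main__":
    for horizon in (1.0, 2.0, 3.0):
        print(f"{horizon:.0f}-year OS AUC: {auc_at_horizon(PATIENTS, horizon):.4f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; for real cohorts a rank-based implementation from an established library would be the more practical choice.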

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afc4/11988547/63fd6f998390/diagnostics-15-00872-g001.jpg

Similar articles

Prediction of STAS in lung adenocarcinoma with nodules ≤ 2 cm using machine learning: a multicenter retrospective study.

BMC Cancer. 2025 Mar 7;25(1):417. doi: 10.1186/s12885-025-13783-z.

Construction of a predictive model for bone metastasis from first primary lung adenocarcinoma within 3 cm based on machine learning algorithm: a retrospective study.

PeerJ. 2024 Mar 14;12:e17098. doi: 10.7717/peerj.17098. eCollection 2024.

Machine learning-based models for the prediction of breast cancer recurrence risk.

BMC Med Inform Decis Mak. 2023 Nov 29;23(1):276. doi: 10.1186/s12911-023-02377-z.

Artificial intelligence in clinical care amidst COVID-19 pandemic: A systematic review.

Comput Struct Biotechnol J. 2021;19:2833-2850. doi: 10.1016/j.csbj.2021.05.010. Epub 2021 May 7.

Ultrasound deep learning radiomics and clinical machine learning models to predict low nuclear grade, ER, PR, and HER2 receptor status in pure ductal carcinoma in situ.

Gland Surg. 2024 Apr 29;13(4):512-527. doi: 10.21037/gs-23-417. Epub 2024 Apr 11.

Explainable Machine Learning Model to Prediction EGFR Mutation in Lung Cancer.

Front Oncol. 2022 Jun 23;12:924144. doi: 10.3389/fonc.2022.924144. eCollection 2022.

Combining handcrafted features with latent variables in machine learning for prediction of radiation-induced lung damage.

Med Phys. 2019 May;46(5):2497-2511. doi: 10.1002/mp.13497. Epub 2019 Apr 8.

Noninvasive prediction of lymph node metastasis in pancreatic cancer using an ultrasound-based clinicoradiomics machine learning model.

Biomed Eng Online. 2024 Jun 18;23(1):56. doi: 10.1186/s12938-024-01259-3.

Surgical Methods and Social Factors Are Associated With Long-Term Survival in Follicular Thyroid Carcinoma: Construction and Validation of a Prognostic Model Based on Machine Learning Algorithms.

Front Oncol. 2022 Jun 21;12:816427. doi: 10.3389/fonc.2022.816427. eCollection 2022.

Cited by

The Impact of Artificial Intelligence on Lung Cancer Diagnosis and Personalized Treatment.

Int J Mol Sci. 2025 Aug 31;26(17):8472. doi: 10.3390/ijms26178472.
