Integration of Nuclear, Clinical, and Genetic Features for Lung Cancer Subtype Classification and Survival Prediction Based on Machine- and Deep-Learning Models.

Author information

Xie Bin, Mo Mingda, Cui Haidong, Dong Yijie, Yin Hongping, Lu Zhe

Affiliations

School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China.

Department of Breast Surgery, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 311121, China.

Publication information

Diagnostics (Basel). 2025 Mar 28;15(7):872. doi: 10.3390/diagnostics15070872.

Abstract

Lung cancer is one of the most prevalent cancers worldwide. Accurately determining lung cancer subtypes and identifying high-risk patients supports individualized treatment and follow-up. Our study aimed to establish an effective model for subtype classification and overall survival (OS) prediction in patients with lung cancer. Histopathological images, clinical data, and genetic information of lung adenocarcinoma and lung squamous cell carcinoma cases were downloaded from The Cancer Genome Atlas. An influencing factor system was optimized based on the nuclear, clinical, and genetic features. Four machine-learning models, namely light gradient boosting machine (LightGBM), extreme gradient boosting (XGBoost), random forest (RF), and adaptive boosting (AdaBoost), and three deep-learning models, namely multilayer perceptron (MLP), TabNet, and convolutional neural network (CNN), were employed for subtype classification and OS prediction. The performance of the models was comprehensively evaluated. XGBoost achieved the highest area under the curve (AUC), 0.9821, in subtype classification, whereas RF achieved the highest AUC values for predicting OS at 1, 2, and 3 years: 0.9134, 0.8706, and 0.8765, respectively. Our study was the first to incorporate the characteristics of nuclei and the genetic information of patients to predict the subtypes and OS of patients with lung cancer. Combining these different factors with artificial intelligence methods yielded a modest improvement in AUC values over previous studies.
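The AUC values quoted in the abstract summarize how well a model's scores rank positive cases above negative ones. As a minimal sketch of that metric only (not the authors' evaluation code), the function below computes the area under the ROC curve with the rank-sum (Mann-Whitney U) identity. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, where higher means "more likely positive".
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Sort case indices by score, then assign 1-based ranks, giving tied
    # scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1

    # U counts the (positive, negative) pairs in which the positive case
    # scores higher, with ties counted as half a pair.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data (not from the study): 1 = one subtype, 0 = the other.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")  # 8 of 9 pairs ranked correctly -> 0.8889
```

An AUC of 1.0 means every positive case outscores every negative case, and 0.5 corresponds to chance-level ranking. For the OS results, the same metric would apply once survival is turned into a binary outcome at each horizon (1, 2, or 3 years); how the authors handled censored patients in that step is not stated in the abstract.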

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afc4/11988547/63fd6f998390/diagnostics-15-00872-g001.jpg