Radiomics feature analysis and model research for predicting histopathological subtypes of non-small cell lung cancer on CT images: A multi-dataset study.

Affiliations

Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, China.

School of Medical Imaging, Shanxi Medical University, Taiyuan, China.

Publication information

Med Phys. 2023 Jul;50(7):4351-4365. doi: 10.1002/mp.16233. Epub 2023 Feb 1.

DOI:10.1002/mp.16233

PMID:36682051

Abstract

PURPOSE

Classifying the subtypes of non-small cell lung cancer (NSCLC) is essential for choosing optimal treatment strategies and improving clinical outcomes, but histological subtypes are at present confirmed by invasive biopsy or post-operative examination. Using multi-center data, this study aimed to analyze the importance of extracted CT radiomics features and to develop a model with good generalization performance for precisely distinguishing the major NSCLC subtypes: adenocarcinoma (ADC) and squamous cell carcinoma (SCC).

METHODS

We collected a multi-center CT dataset of 868 patients from eight international databases in The Cancer Imaging Archive (TCIA). Patients from five of the databases were pooled and split into training and test sets (560:140). The remaining three databases served as independent test sets: the TCGA set (n = 97) and the lung3 set (n = 71). A total of 1409 features describing shape, intensity, and texture were extracted from the tumor volume of interest (VOI); ℓ-norm minimization was then used for feature selection, and the importance of the selected features was analyzed. Next, the prediction and generalization performance of 130 radiomics models (10 common algorithms and 120 heterogeneous ensemble combinations) was compared using the average AUC across the three test sets (the test set, the TCGA set, and the lung3 set). Finally, the predictive results of the optimal model were presented.
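
As a rough illustration of the workflow described above (sparse feature selection followed by a heterogeneous ensemble scored by AUC), the following Python sketch uses scikit-learn on synthetic data. Everything in it is an assumption rather than the study's implementation: the data come from `make_classification`, an L1-penalized linear SVM is swapped in for the ℓ-norm minimization step, and the Bagging, AdaBoost, and SVM members are fused by soft voting because the abstract does not say how the ensemble members were combined. All hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for a radiomics table: rows are patients, columns are
# features, and the 0/1 label stands in for ADC/SCC. This is not study data.
X, y = make_classification(
    n_samples=700, n_features=200, n_informative=20, random_state=0
)

# 560:140 split, mirroring the training/test sizes quoted in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=140, stratify=y, random_state=0
)

# Sparse feature selection. ASSUMPTION: an L1-penalized linear SVM replaces
# the norm-minimization method used in the study.
selector = SelectFromModel(
    LinearSVC(penalty="l1", dual=False, C=0.05, max_iter=5000, random_state=0)
)

# Heterogeneous ensemble. ASSUMPTION: soft voting over the three members.
ensemble = VotingClassifier(
    estimators=[
        ("bagging", BaggingClassifier(n_estimators=25, random_state=0)),
        ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",
)

model = make_pipeline(StandardScaler(), selector, ensemble)
model.fit(X_train, y_train)

# AUC on the held-out split, using the predicted probability of class 1.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
print(f"selected features: {n_selected}, test AUC: {auc:.3f}")
```

Wrapping scaling, selection, and the classifier in one pipeline keeps the selection step fitted on the training split only, so the held-out AUC is not inflated by information from the test patients.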

RESULTS

After feature selection, 401 features remained. Intensity features and the GLCM, GLRLM, and GLSZM texture features had higher classification weight coefficients than the other features (shape, and the GLDM and NGTDM texture features), and filtered-image features were significantly more important than original-image features (p = 0.0210). Moreover, the five ensemble learning algorithms (Bagging, AdaBoost, RF, XGBoost, GBDT) generalized better (p = 0.00418) than the non-ensemble algorithms (MLP, LR, GNB, SVM, KNN). The Bagging-AdaBoost-SVM model had the highest average AUC (0.815 ± 0.010) across the three test sets, with AUC values of 0.819, 0.823, and 0.804 on the test set, the TCGA set, and the lung3 set, respectively.
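
The summary figure of 0.815 ± 0.010 can be checked against the three per-set AUC values. The short Python check below assumes that the ± term is a standard deviation, which the abstract does not state; under that assumption the reported value matches the sample standard deviation (n - 1 denominator).

```python
from statistics import mean, stdev

# Per-set AUC values quoted in the abstract for the Bagging-AdaBoost-SVM model.
aucs = {"test set": 0.819, "TCGA set": 0.823, "lung3 set": 0.804}

avg = mean(aucs.values())
spread = stdev(aucs.values())  # sample standard deviation (n - 1)

print(f"{avg:.3f} ± {spread:.3f}")  # prints "0.815 ± 0.010"
```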

CONCLUSION

Our multi-dataset study showed that intensity features, texture features (GLCM, GLRLM, and GLSZM), and filtered-image features were more important for distinguishing ADCs from SCCs. Ensemble learning can improve prediction and generalization performance on complicated multi-center data. The Bagging-AdaBoost-SVM model had the strongest generalization performance and showed promising clinical value for non-invasively predicting the histopathological subtypes of NSCLC.

Similar articles

Multi-subtype classification model for non-small cell lung cancer based on radiomics: SLS model.

Med Phys. 2019 Jul;46(7):3091-3100. doi: 10.1002/mp.13551. Epub 2019 May 11.

Dual-Centre Harmonised Multimodal Positron Emission Tomography/Computed Tomography Image Radiomic Features and Machine Learning Algorithms for Non-small Cell Lung Cancer Histopathological Subtype Phenotype Decoding.

Clin Oncol (R Coll Radiol). 2023 Nov;35(11):713-725. doi: 10.1016/j.clon.2023.08.003. Epub 2023 Aug 8.

Impact of feature selection methods and subgroup factors on prognostic analysis with CT-based radiomics in non-small cell lung cancer patients.

Radiat Oncol. 2021 Apr 30;16(1):80. doi: 10.1186/s13014-021-01810-9.

Next-Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Algorithms.

Mol Imaging Biol. 2020 Aug;22(4):1132-1148. doi: 10.1007/s11307-020-01487-8.

Radiomics for Classification of Lung Cancer Histological Subtypes Based on Nonenhanced Computed Tomography.

Acad Radiol. 2019 Sep;26(9):1245-1252. doi: 10.1016/j.acra.2018.10.013. Epub 2018 Nov 28.

Histologic subtype classification of non-small cell lung cancer using PET/CT images.

Eur J Nucl Med Mol Imaging. 2021 Feb;48(2):350-360. doi: 10.1007/s00259-020-04771-5. Epub 2020 Aug 10.

Machine learning-based radiomics strategy for prediction of cell proliferation in non-small cell lung cancer.

Eur J Radiol. 2019 Sep;118:32-37. doi: 10.1016/j.ejrad.2019.06.025. Epub 2019 Jun 28.

Intra-tumoural heterogeneity characterization through texture and colour analysis for differentiation of non-small cell lung carcinoma subtypes.

Phys Med Biol. 2018 Aug 22;63(16):165018. doi: 10.1088/1361-6560/aad648.

Intratumoral and peritumoral CT-based radiomics strategy reveals distinct subtypes of non-small-cell lung cancer.

J Cancer Res Clin Oncol. 2022 Sep;148(9):2247-2260. doi: 10.1007/s00432-022-04015-z. Epub 2022 Apr 17.

Cited by

Multimodal radiomics fusion for predicting postoperative recurrence in NSCLC patients.

J Cancer Res Clin Oncol. 2025 Sep 18;151(10):261. doi: 10.1007/s00432-025-06311-w.

CT Radiomics-based machine learning approach for the invasiveness of pulmonary ground-glass nodules prediction.

Eur J Radiol Open. 2025 Aug 23;15:100680. doi: 10.1016/j.ejro.2025.100680. eCollection 2025 Dec.

Application of Fractal Radiomics and Machine Learning for Differentiation of Non-Small Cell Lung Cancer Subtypes on PET/MR Images.

J Clin Med. 2025 Aug 15;14(16):5776. doi: 10.3390/jcm14165776.

Radiomics-based machine learning for differentiating lung squamous cell carcinoma and adenocarcinoma using T1-enhanced MRI of brain metastases.

Front Oncol. 2025 Jul 23;15:1599853. doi: 10.3389/fonc.2025.1599853. eCollection 2025.

Predicting pathological staging of non-small cell lung cancer using a multi-task radiomics model integrating intratumoral and peritumoral features.

Oncol Lett. 2025 Jul 7;30(3):431. doi: 10.3892/ol.2025.15177. eCollection 2025 Sep.

Decoding the Rotation Effect: A Retrospective Analysis of Lesion Orientation and Its Impact on Wavelet-Based Radiomics Feature Extraction and Lung Cancer Classification.

J Imaging Inform Med. 2025 May 6. doi: 10.1007/s10278-025-01520-8.

The influence of image selection and segmentation on the extraction of lung cancer imaging radiomics features using 3D-Slicer software.

BMC Cancer. 2025 Apr 17;25(1):728. doi: 10.1186/s12885-025-14094-z.

Integration of Nuclear, Clinical, and Genetic Features for Lung Cancer Subtype Classification and Survival Prediction Based on Machine- and Deep-Learning Models.

Diagnostics (Basel). 2025 Mar 28;15(7):872. doi: 10.3390/diagnostics15070872.

Hybrid Approach to Classifying Histological Subtypes of Non-small Cell Lung Cancer (NSCLC): Combining Radiomics and Deep Learning Features from CT Images.

J Imaging Inform Med. 2025 Feb 14. doi: 10.1007/s10278-025-01442-5.

Radiomics in distinguishing between lung adenocarcinoma and lung squamous cell carcinoma: a systematic review and meta-analysis.

Front Oncol. 2024 Sep 24;14:1381217. doi: 10.3389/fonc.2024.1381217. eCollection 2024.
