A machine learning model based on ultrasound image features to assess the risk of sentinel lymph node metastasis in breast cancer patients: Applications of scikit-learn and SHAP.

Authors

Zhang Gaosen, Shi Yan, Yin Peipei, Liu Feifei, Fang Yi, Li Xiang, Zhang Qingyu, Zhang Zhen

Affiliations

Department of Ultrasound, First Affiliated Hospital of China Medical University, Shenyang, China.

Department of Ultrasound, Binzhou Medical University Hospital, Binzhou, China.

Publication information

Front Oncol. 2022 Jul 25;12:944569. doi: 10.3389/fonc.2022.944569. eCollection 2022.

Abstract

BACKGROUND

This study aimed to determine an optimal machine learning (ML) model for evaluating the preoperative diagnostic value of ultrasound signs of breast cancer lesions for sentinel lymph node (SLN) status.

METHOD

This study retrospectively analyzed the ultrasound images and postoperative pathological findings of lesions in 952 breast cancer patients. First, we performed a univariate analysis of the relationship between the ultrasonographic morphological features of the breast cancer lesions and SLN metastasis. Then, based on the ultrasound signs of the lesions, we screened ten ML models: support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN). The diagnostic performance of each model was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), Kappa value, accuracy, F1-score, sensitivity, and specificity. We then constructed a clinical prediction model based on the ML algorithm with the best diagnostic performance. Finally, we used SHapley Additive exPlanations (SHAP) to visualize and analyze the diagnostic process of the ML model.
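The six evaluation metrics named above can all be derived from true labels, predicted labels, and predicted scores. The following is a minimal sketch in plain Python, not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration and are not data from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, F1 and Cohen's kappa for 0/1 labels.

    Assumes both classes occur in y_true and at least one positive prediction.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT from the study): 1 = SLN metastasis, 0 = none.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 threshold

print(binary_metrics(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

In practice, scikit-learn (named in the title) provides equivalent functions in `sklearn.metrics`; the hand-written versions here only make the definitions explicit.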

RESULTS

Of the 952 patients with breast cancer, 394 (41.4%) had SLN metastasis and 558 (58.6%) had no metastasis. Univariate analysis found that the shape, orientation, margin, posterior features, calcifications, architectural distortion, duct changes, and suspicious lymph nodes of breast cancer lesions on ultrasound were associated with SLN metastasis. Among the 10 ML algorithms, XGBoost had the best overall diagnostic performance for SLN metastasis, with an average AUC of 0.952, an average Kappa of 0.763, and an average accuracy of 0.891. In the validation cohort, the XGBoost model achieved an AUC of 0.916, an accuracy of 0.846, a sensitivity of 0.870, a specificity of 0.862, and an F1-score of 0.826. The diagnostic performance of the XGBoost model was significantly higher than that of experienced radiologists in some cases (P<0.001). SHAP visualization of the selected ML model showed that suspicious lymph nodes detected on ultrasound, microcalcifications in the primary tumor, spiculated margins of the primary tumor, and architectural distortion of the tissue around the lesion contributed greatly to the diagnostic performance of the XGBoost model.
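To make the idea behind SHAP's per-feature contributions concrete, the sketch below computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over all feature orderings. It is an illustration only: the feature names echo the abstract, but the `toy_risk` function and every weight in it are hypothetical and are not taken from the study or from the XGBoost model. The `shap` library uses far more efficient algorithms for tree ensembles than this brute-force enumeration.

```python
from itertools import permutations

# Hypothetical binary ultrasound signs (names echo the abstract).
FEATURES = ("suspicious_lymph_node", "microcalcification", "spiculated_margin")


def toy_risk(present):
    """Made-up risk score for a set of present signs (NOT the study's model)."""
    score = 0.0
    if "suspicious_lymph_node" in present:
        score += 0.30
    if "microcalcification" in present:
        score += 0.15
    if "spiculated_margin" in present:
        score += 0.10
    # Assumed interaction term, included to show how Shapley values split it.
    if "suspicious_lymph_node" in present and "microcalcification" in present:
        score += 0.05
    return score


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


phi = shapley_values(toy_risk, FEATURES)
for name, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all signs present and the score with none, which is the additivity property that SHAP summary plots rely on; the interaction term is shared equally between the two features involved.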

CONCLUSIONS

The XGBoost model based on the ultrasound signs of the primary breast tumor and its surrounding tissues and lymph nodes has high diagnostic performance for predicting SLN metastasis. Visual explanation with SHAP makes it an effective tool for guiding preoperative clinical management.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/adbc/9359803/d6b5497ec32a/fonc-12-944569-g001.jpg
