Li Mianmian, Su Xinhui, Liao Wenxin, Huang Li, Yang Yihong, Wu Xizi, Fan Yao, Liu Jing, Yang Xin, Zeng Zhen, Ding Wencheng, Zeng Wanjiang, Xu Xiaoyan
Department of Obstetrics and Gynecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China.
Reprod Sci. 2025 Aug 12. doi: 10.1007/s43032-025-01937-0.
This study aimed to predict the occurrence of postpartum hemorrhage in women with placenta previa using machine learning. This retrospective study enrolled 845 patients with singleton pregnancies and placenta previa from two hospitals. They were allocated to a training cohort (n = 403), a testing cohort (n = 174), and an external validation cohort (n = 268). Univariate and multivariate regression analyses were used to select clinical variables (p < 0.05), which were then used to develop 11 machine learning prediction models. Model performance was evaluated with the area under the receiver operating characteristic curve (AUC), decision curve analysis (DCA), accuracy (ACC), sensitivity (SEN), and specificity (SPE). In addition, SHapley Additive exPlanations (SHAP) was used to interpret the contribution of each variable to the predictive model. The three machine learning models with the best predictive performance were combined into a Prediction Ensemble Classifier through voting. Among the individual models, the Gradient Boosting Machine demonstrated the best predictive performance. In the validation cohort, the Gradient Boosting Machine model had an AUC of 0.810 (95% CI 0.754-0.865), an ACC of 0.765 (95% CI 0.716-0.813), and a SEN of 0.613 (95% CI 0.513-0.723), while the corresponding values for the Prediction Ensemble Classifier were 0.813 (0.756-0.871), 0.806 (0.757-0.854), and 0.480 (0.375-0.597), respectively. Ranked by SHAP importance from high to low, the variables in the model were D-dimer, ultrasound diagnosis of placenta accreta spectrum, neutrophils, prothrombin time, and platelets. The Gradient Boosting Machine model demonstrated excellent performance in predicting postpartum hemorrhage in cases of placenta previa. Furthermore, SHAP analysis enabled interpretation of the variables in the model.
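To make the evaluation metrics and the voting step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes soft voting (averaging predicted probabilities), a decision threshold of 0.5, and a rank-based AUC; the abstract specifies none of these details. The names `soft_vote`, `auc`, and `acc_sen_spe`, the three `model_*` probability lists, and the toy labels are all hypothetical and unrelated to the study data.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average the predicted probabilities of several models, patient by patient.

    Assumption: the abstract only says "voting"; soft voting is one common variant.
    """
    return [mean(probs) for probs in zip(*prob_lists)]


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


def acc_sen_spe(y_true, scores, threshold=0.5):
    """Return (accuracy, sensitivity, specificity) at a probability threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    sen = tp / (tp + fn)  # share of hemorrhage cases that were flagged
    spe = tn / (tn + fp)  # share of non-hemorrhage cases that were cleared
    return acc, sen, spe


# Toy labels (1 = postpartum hemorrhage) and hypothetical model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.7, 0.2]
model_b = [0.8, 0.7, 0.4, 0.3, 0.3, 0.2, 0.5, 0.1]
model_c = [0.7, 0.8, 0.2, 0.5, 0.1, 0.3, 0.6, 0.3]

ensemble = soft_vote([model_a, model_b, model_c])
ens_auc = auc(y_true, ensemble)
ens_acc, ens_sen, ens_spe = acc_sen_spe(y_true, ensemble)
print(f"AUC={ens_auc:.3f} ACC={ens_acc:.3f} SEN={ens_sen:.3f} SPE={ens_spe:.3f}")
```

Because ACC, SEN, and SPE depend on the chosen threshold while AUC does not, two models with nearly identical AUC can differ noticeably in sensitivity, which is the pattern reported above for the Gradient Boosting Machine and the Prediction Ensemble Classifier.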