Song Jingwei, Ma Ni, Aini Reziwanguli, Yang Yuqing
School of Public Health, Xinjiang Medical University, Urumqi 830017, Xinjiang, China.
People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi 830001, Xinjiang, China.
Am J Transl Res. 2025 Jul 15;17(7):4939-4951. doi: 10.62347/KEVQ8263. eCollection 2025.
Chronic hepatitis B (CHB) is a leading cause of liver fibrosis, and accurate, non-invasive diagnosis of liver fibrosis in CHB patients is of critical clinical importance. This study aimed to develop and validate machine learning (ML)-based models for predicting significant liver fibrosis in CHB patients.
This retrospective cohort study included 328 CHB patients (225 with non-significant liver fibrosis and 103 with significant liver fibrosis) from 2017 to 2022. Four ML models were built on four features selected by least absolute shrinkage and selection operator (LASSO) regression. Model performance was assessed with receiver operating characteristic (ROC) curves, the area under the curve (AUC), accuracy, sensitivity, and specificity; SHapley Additive exPlanations (SHAP) analysis was used to interpret the models.
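As an illustration of the evaluation metrics named above, the following minimal Python sketch computes AUC, accuracy, sensitivity, and specificity from predicted probabilities. The labels, scores, the 0.5 decision threshold, and the function names are hypothetical; the abstract does not report the threshold or software used, so this is not the authors' implementation.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# Hypothetical toy data: 1 = significant fibrosis, 0 = non-significant.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
metrics = confusion_metrics(y_true, scores)
print(f"AUC = {auc:.3f}", metrics)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and avoids any dependency beyond the standard library; for a cohort of a few hundred patients its quadratic cost is negligible.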
The random forest (RF) model demonstrated the highest predictive performance, with an AUC of 0.874 (95% CI: 0.813-0.934) in the training set and 0.863 (95% CI: 0.772-0.955) in the test set, outperforming extreme gradient boosting (XGBoost), logistic regression (LR), and support vector machine (SVM) models. Compared with traditional fibrosis indices such as the aspartate aminotransferase-to-platelet ratio index (APRI) (AUC = 0.585) and fibrosis-4 (FIB-4) (AUC = 0.633), the RF model (test-set AUC = 0.863) demonstrated significantly higher predictive accuracy. SHAP analysis identified platelet count (PLT) as the most influential predictor in the RF model.
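For reference, the two traditional indices used as comparators can be computed from routine laboratory values. The sketch below uses their standard published definitions; the input values, the AST upper limit of normal (40 U/L), and the function names are hypothetical, and the abstract does not state which reference limit or cutoffs the authors applied.

```python
import math


def apri(ast_u_l, ast_uln_u_l, plt_1e9_l):
    """APRI = (AST / upper limit of normal for AST) / platelets (10^9/L) * 100."""
    return (ast_u_l / ast_uln_u_l) / plt_1e9_l * 100


def fib4(age_years, ast_u_l, alt_u_l, plt_1e9_l):
    """FIB-4 = (age * AST) / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (plt_1e9_l * math.sqrt(alt_u_l))


# Hypothetical patient: age 50, AST 80 U/L, ALT 64 U/L, platelets 100 x 10^9/L,
# with an assumed AST upper limit of normal of 40 U/L.
apri_value = apri(ast_u_l=80, ast_uln_u_l=40, plt_1e9_l=100)
fib4_value = fib4(age_years=50, ast_u_l=80, alt_u_l=64, plt_1e9_l=100)
print(apri_value, fib4_value)
```

Both indices place platelet count in the denominator, so a lower count raises the score, which is consistent with PLT emerging as the most influential predictor in the SHAP analysis.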
The ML-based RF model offers a highly accurate, non-invasive, and interpretable tool for predicting significant liver fibrosis in patients with CHB, with potential for clinical application in routine fibrosis risk assessment.