Machine Learning Did Not Outperform Conventional Competing Risk Modeling to Predict Revision Arthroplasty.

Authors

Oosterhoff Jacobien H F, de Hond Anne A H, Peters Rinne M, van Steenbergen Liza N, Sorel Juliette C, Zijlstra Wierd P, Poolman Rudolf W, Ring David, Jutte Paul C, Kerkhoffs Gino M M J, Putter Hein, Steyerberg Ewout W, Doornberg Job N

Affiliations

Amsterdam UMC, University of Amsterdam, Department of Orthopedic Surgery and Sports Medicine, Amsterdam, the Netherlands.

Department of Engineering Systems and Services, Faculty of Technology Policy and Management, Delft University of Technology, Delft, the Netherlands.

Publication information

Clin Orthop Relat Res. 2024 Aug 1;482(8):1472-1482. doi: 10.1097/CORR.0000000000003018. Epub 2024 Mar 12.

Abstract

BACKGROUND

Estimating the risk of revision after arthroplasty could inform patient and surgeon decision-making. However, well-performing prediction models for this task are lacking, which may be attributable to conventional modeling approaches such as traditional survivorship estimators (for example, Kaplan-Meier) or competing risk estimators. Recent advances in machine learning survival analysis might improve decision support tools in this setting. Therefore, this study aimed to assess the performance of machine learning compared with that of conventional modeling to predict revision after arthroplasty.

QUESTION/PURPOSE

Does machine learning perform better than traditional regression models for estimating the risk of revision for patients undergoing hip or knee arthroplasty?

METHODS

Eleven datasets from published studies from the Dutch Arthroplasty Register reporting on factors associated with revision or survival after partial or total knee and hip arthroplasty between 2018 and 2022 were included in our study. The 11 datasets were observational registry studies, with sample sizes ranging from 3038 to 218,214 procedures. We developed a set of time-to-event models for each dataset, leading to 11 comparisons. A set of predictors (factors associated with revision surgery) was identified based on the variables that were selected in the included studies. We assessed the predictive performance of two state-of-the-art statistical time-to-event models at 1-, 2-, and 3-year follow-up: a Fine and Gray model (which models the cumulative incidence of revision) and a cause-specific Cox model (which models the hazard of revision). These were compared with a machine-learning approach (a random survival forest model, which is a decision tree-based machine-learning algorithm for time-to-event analysis). Performance was assessed according to discriminative ability (time-dependent area under the receiver operating characteristic curve), calibration (slope and intercept), and overall prediction error (scaled Brier score). Discrimination, quantified by the area under the receiver operating characteristic curve, measures the model's ability to distinguish patients who experienced the outcome from those who did not and ranges from 0.5 to 1.0, with 1.0 indicating the highest discrimination and 0.5 the lowest. Calibration compares predicted with observed probabilities; a perfectly calibrated model has an intercept of 0 and a slope of 1. The Brier score is a composite of discrimination and calibration, with 0 indicating perfect prediction and 1 the poorest. A scaled version of the Brier score, 1 - (model Brier score/null model Brier score), expresses the model's overall prediction error relative to that of a null model.
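The performance measures described above can be illustrated with a minimal sketch. It is not the study's code: it assumes a simple binary outcome at a single fixed time point and ignores censoring and competing risks, which the time-dependent versions used in the study would have to account for. The function names and the toy data are hypothetical.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted risk and observed 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def scaled_brier(y_true, p_pred):
    """1 - (model Brier score / null model Brier score).

    Assumption: the null model predicts the observed event rate for everyone.
    """
    event_rate = sum(y_true) / len(y_true)
    null_score = brier_score(y_true, [event_rate] * len(y_true))
    return 1 - brier_score(y_true, p_pred) / null_score


def auc(y_true, p_pred):
    """Area under the ROC curve via pairwise comparison.

    Fraction of (event, non-event) pairs in which the event case received
    the higher predicted risk; ties count as half.
    """
    events = [p for y, p in zip(y_true, p_pred) if y == 1]
    non_events = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return wins / (len(events) * len(non_events))


# Hypothetical toy data: 1 = revised within the horizon, 0 = not revised.
revised = [0, 0, 1, 0, 1, 0, 0, 1]
predicted_risk = [0.1, 0.2, 0.7, 0.3, 0.4, 0.1, 0.5, 0.6]

if __name__ == "__main__":
    print("Brier score:", brier_score(revised, predicted_risk))
    print("Scaled Brier score:", scaled_brier(revised, predicted_risk))
    print("AUC:", auc(revised, predicted_risk))
```

Under this definition, a scaled Brier score of 0 means the model predicts no better than the null model, and values closer to 1 indicate lower prediction error.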

RESULTS

We found no differences in discriminative ability (distinguishing patients who received a revision from those who did not) between machine learning survival analysis and the conventional competing risk regression models for patients undergoing arthroplasty. There were no consistent differences in validated performance (time-dependent area under the receiver operating characteristic curve) between modeling approaches: the differences ranged from -0.04 to 0.03 across the 11 datasets, and the time-dependent area under the receiver operating characteristic curve of the models ranged from 0.52 to 0.68. In addition, the calibration metrics and scaled Brier scores produced comparable estimates, showing no advantage of machine learning over traditional regression models.

CONCLUSION

Machine learning did not outperform traditional regression models.

CLINICAL RELEVANCE

Neither machine learning modeling nor traditional regression methods were sufficiently accurate to offer prognostic information when predicting revision arthroplasty. The benefit of these modeling approaches may be limited in this context.
