Re-evaluation of the comparative effectiveness of bootstrap-based optimism correction methods in the development of multivariable clinical prediction models.

Author information

Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, Tokyo, Japan.

Office of Biostatistics, Department of Biometrics, Headquarters of Clinical Development, Otsuka Pharmaceutical Co., Ltd., Tokyo, Japan.

Publication information

BMC Med Res Methodol. 2021 Jan 7;21(1):9. doi: 10.1186/s12874-020-01201-w.

Abstract

BACKGROUND

Multivariable prediction models are important statistical tools for providing synthetic diagnostic and prognostic algorithms based on patients' multiple characteristics. Their apparent measures of predictive accuracy are usually biased upward (known as 'optimism') relative to their actual performance in external populations. Existing statistical evidence and guidelines suggest that three bootstrap-based bias correction methods are preferable in practice, namely Harrell's bias correction and the .632 and .632+ estimators. Although Harrell's method has been widely adopted in clinical studies, simulation-based evidence indicates that the .632+ estimator may perform better than the other two methods. However, the actual comparative effectiveness of these methods remains unclear because numerical evidence is limited.
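To make the three corrections concrete, the following is a minimal sketch of how each estimator combines the apparent C-statistic with bootstrap quantities. The function names are hypothetical, and the .632+ form shown is an assumption: it adapts the usual error-rate formulation to the C-statistic with a no-information value of 0.5 and a relative overfitting rate restricted to [0, 1]. The abstract itself does not spell out these formulas, and model fitting and resampling are left out.

```python
from statistics import mean


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(nonevents))


def harrell_corrected(apparent, boot_apparent, boot_on_original):
    """Harrell's correction: subtract the average optimism, i.e. the mean gap
    between each bootstrap model's C-statistic on its own bootstrap sample
    and its C-statistic on the original data."""
    optimism = mean(b - o for b, o in zip(boot_apparent, boot_on_original))
    return apparent - optimism


def estimate_632(apparent, out_of_bag):
    """.632 estimator: fixed-weight blend of apparent and out-of-bag values."""
    return 0.368 * apparent + 0.632 * out_of_bag


def estimate_632_plus(apparent, out_of_bag, gamma=0.5):
    """.632+ estimator (assumed C-statistic adaptation, gamma = 0.5).
    The weight on the out-of-bag value grows with the overfitting rate R."""
    oob = max(out_of_bag, gamma)  # do not go below the no-information value
    if apparent > oob and apparent > gamma:
        r = (apparent - oob) / (apparent - gamma)
    else:
        r = 0.0
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * apparent + w * oob


# Illustrative inputs only; these numbers are not taken from the study.
apparent, oob = 0.80, 0.70
print(estimate_632(apparent, oob), estimate_632_plus(apparent, oob))
```

In this sketch the .632+ value moves closer to the out-of-bag estimate as the gap between apparent and out-of-bag performance widens. That is the mechanism by which it corrects more aggressively when overfitting is severe.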

METHODS

We conducted extensive simulation studies to compare the effectiveness of these three bootstrapping methods under various model-building strategies: conventional logistic regression, stepwise variable selection, Firth's penalized likelihood method, and ridge, lasso, and elastic-net regression. We generated the simulation data based on the Global Utilization of Streptokinase and Tissue plasminogen activator for Occluded coronary arteries (GUSTO-I) trial Western dataset and considered how events per variable, event fraction, number of candidate predictors, and the regression coefficients of the predictors affected performance. The internal validity of the C-statistic was evaluated.
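As a rough illustration of how the simulation factors relate to one another, the sketch below converts a target events per variable (EPV), number of candidate predictors, and event fraction into a total sample size. It assumes the usual definition of EPV as the number of events divided by the number of candidate predictors. The function names and the example values are hypothetical and are not the settings used in the study.

```python
import math


def required_sample_size(epv, n_predictors, event_fraction):
    """Total sample size needed so that the expected number of events
    equals epv * n_predictors at the given event fraction."""
    n_events = epv * n_predictors
    return math.ceil(n_events / event_fraction)


def events_per_variable(n_total, event_fraction, n_predictors):
    """Expected EPV for a dataset of n_total patients."""
    return n_total * event_fraction / n_predictors


# Illustrative scenario: EPV of 10, 8 candidate predictors, 25% events.
n = required_sample_size(epv=10, n_predictors=8, event_fraction=0.25)
print(n, events_per_variable(n, 0.25, 8))
```

For a fixed EPV, a smaller event fraction implies a larger total sample. This is one reason the two factors are worth varying separately.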

RESULTS

Under relatively large sample settings (roughly, events per variable ≥ 10), the three bootstrap-based methods were comparable and performed well. However, all three methods were biased under small sample settings, and the directions and sizes of the biases were inconsistent. In general, Harrell's and the .632 methods had overestimation biases when the event fraction became larger. In addition, the .632+ method had a slight underestimation bias when the event fraction was very small. Although the bias of the .632+ estimator was relatively small, its root mean squared error (RMSE) was comparable to, or sometimes larger than, those of the other two methods, especially for the regularized estimation methods.
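The distinction between bias and RMSE can be sketched as follows: an estimator whose errors average out to nearly zero can still have a larger RMSE if its estimates vary widely across simulation replicates. The helper names and the two sets of estimates below are made up for illustration and are not results from the study.

```python
import math
from statistics import mean


def bias(estimates, truth):
    """Mean signed error of the estimates around the true value."""
    return mean(e - truth for e in estimates)


def rmse(estimates, truth):
    """Root mean squared error, reflecting both bias and variability."""
    return math.sqrt(mean((e - truth) ** 2 for e in estimates))


truth = 0.75  # hypothetical true C-statistic in the external population

# Hypothetical estimator A: consistently a little too high, but stable.
stable = [0.77, 0.78, 0.77, 0.78]
# Hypothetical estimator B: centred on the truth, but highly variable.
noisy = [0.70, 0.80, 0.69, 0.81]

for name, est in (("stable", stable), ("noisy", noisy)):
    print(name, bias(est, truth), rmse(est, truth))
```

Here the variable estimator has the smaller absolute bias but the larger RMSE, which mirrors the pattern described for the .632+ estimator.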

CONCLUSIONS

In general, the three bootstrap estimators were comparable, but the .632+ estimator performed relatively well under small sample settings, except when regularized estimation methods were adopted.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbbb/7789544/4bc16ee6d5e4/12874_2020_1201_Fig1_HTML.jpg
