Zou Kelly H, Resnic Frederic S, Talos Ion-Florin, Goldberg-Zimring Daniel, Bhagwat Jui G, Haker Steven J, Kikinis Ron, Jolesz Ferenc A, Ohno-Machado Lucila
Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, MIT, MA, USA.
J Biomed Inform. 2005 Oct;38(5):395-403. doi: 10.1016/j.jbi.2005.02.004. Epub 2005 Mar 9.
Medical classification accuracy studies often yield continuous data based on predictive models for treatment outcomes. A popular method for evaluating the performance of diagnostic tests is the receiver operating characteristic (ROC) curve analysis. The main objective was to develop a global statistical hypothesis test for assessing the goodness-of-fit (GOF) for parametric ROC curves via the bootstrap.
A simple log (or logit) transformation and a more flexible Box-Cox normality transformation were applied to data from two clinical studies: one predicting complications following percutaneous coronary interventions (PCIs), and one predicting image-guided neurosurgical resection results from tumor volume. Both the untransformed and the transformed data were analyzed. We compared a non-parametric estimate with a parametric binormal estimate of the underlying ROC curve. To construct the GOF test, we used the non-parametric and parametric areas under the curve (AUCs) as the metrics and reported the resulting p value.
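One way to picture the comparison described above is the following minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact test statistic or resampling scheme. The sketch assumes that the statistic is the difference between the binormal and non-parametric AUCs, that the two outcome groups are resampled separately with replacement, and that the p value comes from a normal approximation using the bootstrap standard error. The function names (`nonparametric_auc`, `binormal_auc`, `gof_pvalue`) and the parameters `n_boot` and `seed` are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def nonparametric_auc(neg, pos):
    """Mann-Whitney estimate of the AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binormal_auc(neg, pos):
    """AUC implied by fitting a separate normal distribution to each group."""
    shift = mean(pos) - mean(neg)
    scale = (stdev(pos) ** 2 + stdev(neg) ** 2) ** 0.5
    return NormalDist().cdf(shift / scale)


def gof_pvalue(neg, pos, n_boot=500, seed=0):
    """Two-sided p value for H0: binormal AUC equals non-parametric AUC.

    Assumption: normal approximation with a bootstrap standard error.
    """
    rng = random.Random(seed)
    observed = binormal_auc(neg, pos) - nonparametric_auc(neg, pos)
    diffs = []
    for _ in range(n_boot):
        # Resample each outcome group separately, with replacement.
        b_neg = rng.choices(neg, k=len(neg))
        b_pos = rng.choices(pos, k=len(pos))
        diffs.append(binormal_auc(b_neg, b_pos) - nonparametric_auc(b_neg, b_pos))
    z = observed / stdev(diffs)
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

Here `neg` and `pos` are lists of classifier scores for patients without and with the outcome. A small p value would indicate that the binormal AUC disagrees with the non-parametric AUC, which is the kind of departure from GOF that a normalizing transformation is meant to remove.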
In the interventional cardiology example, logit and Box-Cox transformations of the predictive probabilities led to satisfactory AUCs (AUC=0.888; p=0.78, and AUC=0.888; p=0.73, respectively), while in the brain tumor resection example, log and Box-Cox transformations of the tumor size also led to satisfactory AUCs (AUC=0.898; p=0.61, and AUC=0.899; p=0.42, respectively). In contrast, significant departures from GOF were observed when a binormal model was assumed without any prior transformation (AUC=0.766; p=0.004, and AUC=0.831; p=0.03, respectively).
In both studies the p values suggested that transformations were important to consider before applying any binormal model to estimate the AUC. Our analyses also demonstrated and confirmed the predictive values of different classifiers for determining the interventional complications following PCIs and resection outcomes in image-guided neurosurgery.
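For reference, the transformations named in this abstract can be written in a few lines of Python. This is a sketch of the standard textbook definitions, not the authors' code. The abstract does not say how the Box-Cox parameter was chosen, so `lam` is simply supplied by the caller here, and the function names are hypothetical.

```python
import math


def logit(p):
    """Logit transformation for a predicted probability with 0 < p < 1."""
    return math.log(p / (1 - p))


def box_cox(x, lam):
    """Box-Cox transformation for x > 0; reduces to the natural log when lam == 0."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam
```

Applying `logit` to predicted probabilities, or `math.log` or `box_cox` to tumor volumes, before fitting the binormal model corresponds to the transformed analyses reported above.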