Papiol Elisabeth, Ferrer Ricard, Ruiz-Rodríguez Juan C, Díaz Emili, Zaragoza Rafael, Borges-Sa Marcio, Berrueta Julen, Gómez Josep, Bodí María, Sancho Susana, Suberviola Borja, Trefler Sandra, Rodríguez Alejandro
Intensive Care Department, Vall d'Hebron University Hospital, 08035 Barcelona, Spain.
Shock, Organ Dysfunction and Resuscitation Research Group, Vall d'Hebron Research Institute (VHIR), 08035 Barcelona, Spain.
J Clin Med. 2025 Jul 30;14(15):5383. doi: 10.3390/jcm14155383.
Background/Objectives: The SARS-CoV-2 and influenza A (H1N1)pdm09 pandemics have resulted in high numbers of ICU admissions, with high mortality. Identifying risk factors for ICU mortality at the time of admission can help optimize clinical decision making. However, the risk factors identified may differ depending on the type of analysis used. Our aim was to compare the risk factors identified by, and the performance of, a linear model (multivariable logistic regression, GLM) and a non-linear model (random forest, RF) in a large national cohort.
Methods: A retrospective analysis was performed on a multicenter database including 8902 critically ill patients with influenza A (H1N1)pdm09 or COVID-19 admitted to 184 Spanish ICUs. Demographic, clinical, laboratory, and microbiological data from the first 24 h were used. Prediction models were built using GLM and RF. The performance of the GLM was evaluated by area under the ROC curve (AUC), precision, sensitivity, and specificity, and that of the RF by out-of-bag (OOB) error and accuracy. In addition, for the RF, the importance of the variables was determined in terms of accuracy reduction (AR) and Gini index reduction (GI).
Results: Overall mortality in the ICU was 25.8%. Model performance was similar, with AUC = 76% for GLM and AUC = 75.6% for RF. GLM identified 17 independent risk factors, while RF identified 19 for AR and 23 for GI. Thirteen variables were found to be important in both models. Laboratory variables such as procalcitonin, white blood cells, lactate, and D-dimer levels were not significant in GLM but were significant in RF. Conversely, acute kidney injury and the presence of spp. were important variables in the GLM but not in the RF.
Conclusions: Although the performance of the linear and non-linear models was similar, the risk factors identified differed depending on the model used. This alerts clinicians to the limitations and usefulness of studies limited to a single type of model.
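The evaluation metrics named in the abstract (AUC, sensitivity, specificity, precision, accuracy) can be illustrated with a minimal sketch. This is not the authors' code: the outcome labels and predicted probabilities below are invented toy values, and the function names `auc` and `confusion_metrics` as well as the 0.5 decision threshold are assumptions made only for illustration.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, precision and accuracy at a fixed
    threshold (0.5 is an assumed cut-off, not taken from the study)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
    }


# Toy data (invented): 1 = died in ICU, 0 = survived.
y_toy = [1, 1, 1, 0, 0, 0, 0, 0]
p_toy = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    print("AUC:", auc(y_toy, p_toy))
    print(confusion_metrics(y_toy, p_toy))
```

AUC is threshold-free, whereas sensitivity, specificity, precision, and accuracy all depend on the chosen cut-off, which is one reason two models with similar AUC can still behave differently at a given decision threshold.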