Fishe Jennifer, Pan Jinqian, Fedele David, Lyu Mengxian, Henson Morgan, Menze Nolan, Scott Taylor, Munson Taylor, Carberry Meghan, Dyer Abigail, Ketola Kelli, Wu Yonghui, Xu Jie
University of Florida.
Res Sq. 2025 May 9:rs.3.rs-6370010. doi: 10.21203/rs.3.rs-6370010/v1.
Asthma is one of the most common chronic diseases of childhood. Reliable identification of pediatric asthma patients in electronic health records (EHRs) is essential for both research and clinical care. However, existing computable phenotypes (CPs) vary in effectiveness. This study aims to evaluate current CPs and to develop a new CP, named COMPAC (COMputable Phenotype for Asthma in Children), to improve EHR-based identification of pediatric asthma patients.
Multiple CP rules were designed using various combinations of diagnosis codes, prescriptions, and clinical note text. A cohort from the University of Florida Integrated Data Repository (IDR) was used for validation through manual chart reviews. Performance was assessed using standard metrics and compared to existing CPs. Additionally, bootstrapping and demographic subgroup analyses were conducted to compare the performance of the new COMPAC to previously published CPs.
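As an illustration of the evaluation described above, the following is a minimal sketch of how sensitivity, positive predictive value, and F1 score can be computed against chart-review labels, with a percentile bootstrap for 95% confidence intervals. It is not the authors' code: the function names, the number of resamples, the choice of a percentile interval, and the toy labels are all assumptions made for illustration.

```python
import random

def cp_metrics(y_true, y_pred):
    """Sensitivity, PPV, and F1 for binary labels (1 = asthma, 0 = not asthma)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else nan,
    }

def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for each metric (resampling patients with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    draws = {"sensitivity": [], "ppv": [], "f1": []}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        m = cp_metrics([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for name, value in m.items():
            if value == value:  # skip NaN from resamples with an empty denominator
                draws[name].append(value)
    ci = {}
    for name, values in draws.items():
        if not values:
            ci[name] = (float("nan"), float("nan"))
            continue
        values.sort()
        lo = values[int((alpha / 2) * (len(values) - 1))]
        hi = values[int((1 - alpha / 2) * (len(values) - 1))]
        ci[name] = (lo, hi)
    return ci

# Toy data only (hypothetical): chart-review labels vs. CP output for 10 patients.
chart_review = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
cp_output = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

point = cp_metrics(chart_review, cp_output)
intervals = bootstrap_ci(chart_review, cp_output)
```

Running the same two functions on a subset of patients (for example, one age group) would give the kind of subgroup comparison mentioned above, again under the assumptions stated here.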
COMPAC demonstrated improved case identification compared to existing CPs, with high sensitivity (0.728; 95% confidence interval [CI]: 0.607-0.864), high positive predictive value (0.886; 95% CI: 0.737-1.0), and an overall F1 score of 0.797 (95% CI: 0.682-0.90). Notably, COMPAC outperformed two previously published CPs in terms of F1 score. Performance varied across demographic subgroups: COMPAC performed best in males, non-Hispanic Whites, and the 6- to 12-year-old age group, and performed worse in the 2- to 5-year-old age group.
COMPAC offers an improved approach for pediatric asthma case identification in EHRs. However, further validation across different sites and refinement to capture a broader range of clinical presentations are necessary to optimize its sensitivity and specificity.