Division of Nutrition Epidemiology and Data Science, Tufts University Friedman School of Nutrition Science and Policy, 150 Harrison Avenue, Boston, MA 02111, USA.
Gordon Institute, Tufts University School of Engineering, 200 Boston Avenue, Medford, MA 02155, USA.
Int J Environ Res Public Health. 2022 Mar 2;19(5):2898. doi: 10.3390/ijerph19052898.
Public health agencies routinely collect time-referenced records to describe and compare the characteristics of foodborne outbreaks. Few studies, however, provide comprehensive metadata that inform researchers of data limitations before they conduct statistical modeling. We described the completeness of 103 variables for 22,792 outbreaks publicly reported by the United States Centers for Disease Control and Prevention’s (US CDC’s) electronic Foodborne Outbreak Reporting System (eFORS) and National Outbreak Reporting System (NORS). We quantified overall, annual, and monthly completeness as the percentage of outbreaks with non-blank records per study period, calendar year, and study month, respectively. We compared monthly trends in completeness between the eFORS (1998–2008) and NORS (2009–2019) reporting periods using segmented time series analyses adjusted for seasonality. Outbreaks of unknown genus (n = 7401), Norovirus (n = 6414), Salmonella (n = 2872), Clostridium (n = 944), and multiple genera (n = 779) accounted for 80.77% of all outbreaks. However, crude completeness ranged from 46.06% to 60.19% across the 103 variables assessed. Variables with the lowest crude completeness (3.32–6.98%) included pathogen, specimen etiological testing, and secondary transmission traceback information. For variables with low (<35%) average monthly completeness during eFORS, completeness increased by 0.33–0.40% per month after the transition to NORS, most likely because the new reporting system expanded surveillance capacity and coverage. Examining completeness metrics in outbreak surveillance systems provides essential information on the availability of data for public reuse. These metadata help public health statisticians and modelers monitor and track the geographic spread, event duration, and illness intensity of foodborne outbreaks more precisely.
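The completeness metric described above can be illustrated with a minimal sketch. It assumes that completeness means the percentage of outbreaks whose record for a given variable is not blank, grouped by study period, calendar year, or study month. The function name `completeness`, the field names (`year`, `month`, `genus`), and the toy records are hypothetical and are not taken from eFORS or NORS; this is not the authors' code.

```python
from collections import defaultdict

def completeness(records, variable, key=None):
    """Percentage of records with a non-blank value for `variable`.

    `key` is an optional function mapping a record to a group label
    (for example a year, or a (year, month) pair). Without it, all
    records fall into a single "overall" group.
    """
    filled = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = key(rec) if key else "overall"
        total[group] += 1
        value = rec.get(variable)
        # Treat None and empty/whitespace-only strings as blank (an assumption).
        if value is not None and str(value).strip() != "":
            filled[group] += 1
    return {g: 100.0 * filled[g] / total[g] for g in total}

# Hypothetical toy outbreak records, for illustration only.
outbreaks = [
    {"year": 2008, "month": 12, "genus": "Salmonella"},
    {"year": 2008, "month": 12, "genus": ""},
    {"year": 2009, "month": 1, "genus": "Norovirus"},
    {"year": 2009, "month": 1, "genus": None},
    {"year": 2009, "month": 2, "genus": "Clostridium"},
]

overall = completeness(outbreaks, "genus")
annual = completeness(outbreaks, "genus", key=lambda r: r["year"])
monthly = completeness(outbreaks, "genus", key=lambda r: (r["year"], r["month"]))

print(overall)   # completeness over the whole study period
print(annual)    # completeness per calendar year
print(monthly)   # completeness per study month
```

The monthly values form the kind of series that a segmented time series analysis could then compare across the two reporting periods.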