Hoek G, Bouma F, Janssen N, Wesseling J, van Ratingen S, Kerckhoffs J, Gehring U, Hendricx W, Vermeulen R, de Hoogh K
Institute for Risk Assessment Sciences, Utrecht University, the Netherlands.
National Institute for Public Health and the Environment (RIVM), Bilthoven, the Netherlands.
Res Rep Health Eff Inst. 2025 Mar(226):1-101.
Assessment of long-term exposure to outdoor air pollution remains a major challenge for epidemiological studies. One aspect of this challenge is characterizing fine-scale spatial variation of the ambient concentrations of key traffic-related air pollutants, including ultrafine particles (UFPs), black carbon (BC), and nitrogen dioxide (NO2). Epidemiological studies have used widely different approaches to address this challenge, including empirical land use regression (LUR) models based on fixed-site routine or targeted monitoring, low-cost sensor networks, mobile monitoring, and deterministic dispersion models. Little information is available about the relative performance of these different approaches for assessing long-term exposure to traffic-related air pollution. Different methods may result in heterogeneity in health effect estimates from epidemiological studies applying different exposure-assessment approaches.
The Specific Aims of the study.
We assessed UFPs, NO2, BC, and particulate matter ≤2.5 μm in aerodynamic diameter (PM2.5).
We evaluated annual average air pollution concentrations across the Netherlands using a suite of different exposure models, which differed in modeling approach (empirical LUR, deterministic dispersion models) and monitoring data used (low-cost sensors, mobile monitoring, nationwide and Europewide routine monitoring, and study-specific targeted monitoring). For empirical models, we tested three model development algorithms: supervised linear regression (SLR), Random Forest, and least absolute shrinkage and selection operator (LASSO). The predictions of the models were compared at 20,000 addresses across the Netherlands. The performance was also tested on external validation data, which were obtained from a new campaign (2021-2023) and existing data from different years, allowing assessment of how well recent models predict past air pollution exposure. Epidemiological analyses in three cohort studies were conducted to compare health effect estimates of the different exposure models. We assessed associations of air pollution in a national administrative cohort with natural-cause and cause-specific mortality, in a cohort study that had detailed lifestyle data with natural-cause mortality and incidence of stroke and coronary events, and in a mature birth cohort with lung function and asthma incidence.
Exposure predictions at residential sites from the dispersion model and the Europewide hybrid LUR models were available for multiple years in the period 2010-2019. For these models, exposure predictions of different years in the period 2010-2019 were highly correlated for BC, NO2, and PM2.5 (correlation coefficient R > 0.9). Consistently, the year of the exposure model did not affect the presence of an association with mortality and morbidity. Small differences in hazard ratios (HR) were related to exposure contrast for different years. The HR for the association of NO2 with natural-cause mortality was 1.026 (95% confidence interval [CI]: 1.022-1.031) for the 2010 exposure estimate and 1.030 (1.024-1.035) for the 2019 exposure estimate of the Europewide LUR model, expressed per 10 µg/m³.
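When comparing hazard ratios that are expressed per different exposure increments, it can help to bring them onto a common scale. The following is a minimal sketch, assuming a log-linear exposure-response relation (an assumption made here for illustration, not a statement about the report's models); the function name `rescale_hr` and the 5 µg/m³ target increment are hypothetical.

```python
import math


def rescale_hr(hr, from_increment, to_increment):
    """Re-express a hazard ratio for a different exposure increment.

    Assumes the log hazard is linear in exposure, so the log HR scales
    proportionally with the size of the increment.
    """
    return math.exp(math.log(hr) * to_increment / from_increment)


# HRs for NO2 and natural-cause mortality per 10 µg/m³ (values quoted in the text)
hr_2010 = 1.026
hr_2019 = 1.030

# Hypothetical 5 µg/m³ increment, chosen only to illustrate the rescaling
print(round(rescale_hr(hr_2010, 10, 5), 4))
print(round(rescale_hr(hr_2019, 10, 5), 4))
```

Under this assumption, halving the increment takes the square root of the HR, so the ordering of the two estimates is unchanged.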
The exposure models generally resulted in highly to moderately correlated exposure predictions at residential sites across the Netherlands (R > 0.7 for BC, NO2, and UFPs; R > 0.5 for PM2.5). The predicted level of exposure and exposure contrast could differ substantially between models and algorithms within models; for example, the interquartile range (IQR) for BC for each of the various models at the 20,000 residential locations ranged from 0.1 to 2.2 µg/m³. Mobile monitoring studies generally resulted in modestly higher BC concentrations and exposure contrasts compared to other exposure models. Small differences were found between the different models in explaining the spatial variation of air pollution concentrations at the new and existing validation sites. Models explained historical exposure patterns at external sites covering more than 10 years moderately well, especially for BC (R > 0.7) and NO2 (R > 0.7), and moderately so for UFPs (R > 0.5). Most models predicted the small concentration contrasts of PM2.5 relatively poorly.
Consistent with the high correlation of the different exposure models, the application of these models generally resulted in similar conclusions on the presence of associations with natural-cause, respiratory, and lung cancer mortality in the large nationwide cohort, and with asthma incidence and lung function in the birth cohort. However, the effect estimates differed substantially; for example, the HR for natural-cause mortality in the nationwide administrative cohort for a 1 µg/m³ increase in BC ranged from 1.01 (95% CI: 0.99-1.02) to 1.09 (1.07-1.10). For the outcomes with small effect estimates and the smaller cohort studies, differences in conclusions related to the exposure assessment method were more distinct.
Differences in exposure assessment may contribute substantially to the observed heterogeneity of effect estimates in systematic reviews of epidemiological studies. High heterogeneity was indicated by the commonly used heterogeneity measure I², where the value was above 80% for a meta-analysis of the different effect estimates for natural-cause mortality in the nationwide cohort.
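To make the heterogeneity measure concrete, the sketch below computes Cochran's Q and Higgins' I² from hazard ratios and their 95% CIs using inverse-variance weights. It is an illustration only: it uses just the two extreme BC estimates quoted in the abstract (1.01 and 1.09), whereas the report's meta-analysis covered the full set of exposure models, so the resulting value is not the report's figure. The function names are hypothetical.

```python
import math


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Return the log HR and its standard error, recovered from a 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * z)


def i_squared(estimates):
    """Higgins' I² (in percent) from a list of (log_hr, se) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    # Fixed-effect (inverse-variance) pooled log HR
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# The two extreme BC hazard ratios for natural-cause mortality quoted in the text
extremes = [
    log_hr_and_se(1.01, 0.99, 1.02),
    log_hr_and_se(1.09, 1.07, 1.10),
]
print(f"I² = {i_squared(extremes):.1f}%")
```

Because only the two most divergent estimates are used, the I² obtained here is well above the 80% threshold mentioned in the text and overstates the heterogeneity across all models.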
Validation of long-term exposure models for the nonroutinely monitored pollutants BC and especially UFPs was challenging, despite generally successful monitoring. The new external validation monitoring campaign resulted in rather unstable estimates of the long-term average spatial contrast, both across sites and where affected by temporal variation, especially for BC and PM2.5.
No consistent differences were found in the model performance of SLR, Random Forest, and LASSO, either in internal cross-validation of model building or at external validation sites not used in model building. Exposure predictions from the three algorithms were generally highly correlated and resulted in similar associations with health. However, for individual models, occasionally large differences were found in exposure contrast, validation statistics, and associations with mortality and morbidity outcomes.
There was little benefit in using low-cost sensors for NO2 and PM2.5. The addition of low-cost sensor data did not improve NO2 estimates in models that combined dispersion model estimates and data from the national monitoring network.
The main conclusions of the project were:
• Exposure predictions of BC, NO2, and PM2.5 for different years between 2010 and 2019 were highly correlated, documenting stable spatial contrast patterns. Consistently, the year of the exposure model did not affect the presence of an association with mortality and morbidity outcomes.
• Models explained historical exposure patterns at external sites covering more than 10 years moderately well, especially for BC.
• Different exposure models generally resulted in highly to moderately correlated exposure predictions. The predicted level of exposure and exposure contrast could differ substantially between models. Small differences were found between the different models in explaining spatial variation at validation sites.
• Application of different exposure models resulted in similar conclusions about the presence of associations with health outcomes, but effect estimates differed substantially in magnitude between individual exposure models. No consistent differences in effect estimates were found between groups of mobile, dispersion, and fixed-site LUR models.
• Differences in exposure models may therefore contribute substantially to the observed heterogeneity of effect estimates in systematic reviews of epidemiological studies. Factors that explained some of the heterogeneity of effect estimates included the performance of the model at external validation sites and the predicted exposure contrast.
• Exposure predictions from the three algorithms were generally highly correlated and resulted in similar associations with health. No consistent differences were found in their model performances.