Wu Xiao, Mealli Fabrizia, Kioumourtzoglou Marianthi-Anna, Dominici Francesca, Braun Danielle
Department of Biostatistics, Mailman School of Public Health, Columbia University.
Department of Statistics, Informatics, Applications and Florence Center for Data Science, University of Florence.
J Am Stat Assoc. 2024;119(545):757-772. doi: 10.1080/01621459.2022.2144737. Epub 2022 Dec 12.
Matching is a well-established approach to causal inference with a binary treatment, but it remains underdeveloped for continuous treatments or exposures. We propose a matching approach, based on the generalized propensity score (GPS), to estimate an average causal exposure-response function for continuous exposures. Our approach retains three attractive features of matching: (a) clear separation between the design and the analysis; (b) robustness to model misspecification and to extreme values of the estimated GPS; (c) straightforward assessment of covariate balance. We first introduce an identifiability assumption, called local weak unconfoundedness. Under this assumption and mild smoothness conditions, we provide theoretical guarantees that the proposed matching estimator attains point-wise consistency and asymptotic normality. In simulations, the proposed matching approach outperforms existing methods under model misspecification or in the presence of extreme values of the estimated GPS. We apply the proposed method to estimate the average causal exposure-response function between long-term PM exposure and all-cause mortality among 68.5 million Medicare enrollees from 2000 to 2016. We found strong evidence of a harmful effect of long-term PM exposure on mortality. Code for the proposed matching approach is provided in an R package available on CRAN, which provides a computationally efficient implementation.
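The general idea of matching on a GPS for a continuous exposure can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the API of the R package. All function names and the `delta` caliper parameter are hypothetical, and the sketch makes several simplifying assumptions that the abstract does not state: the GPS is modelled as a Gaussian density around a linear regression of exposure on covariates, candidate matches are restricted to units whose observed exposure lies within `delta` of the target level, and matching uses the GPS distance alone, with replacement.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fit_gps_model(w, C):
    """Fit an ASSUMED Gaussian linear exposure model W | C ~ N(X beta, sd^2)."""
    X = np.column_stack([np.ones(len(w)), C])  # add an intercept column
    beta, *_ = np.linalg.lstsq(X, w, rcond=None)
    resid = w - X @ beta
    sd = resid.std(ddof=X.shape[1])  # residual standard deviation
    return beta, sd


def matched_mean_outcome(w_target, w, C, y, beta, sd, delta):
    """Estimate the mean potential outcome at exposure level w_target.

    For every unit i, find the observed unit j whose exposure is within
    delta of w_target and whose GPS e(w_j, c_j) is closest to e(w_target, c_i),
    then average the matched outcomes. A simplified illustration only.
    """
    in_caliper = np.flatnonzero(np.abs(w - w_target) <= delta)
    if in_caliper.size == 0:
        raise ValueError("no observed exposures within delta of w_target")

    X = np.column_stack([np.ones(len(w)), C])
    mu = X @ beta
    gps_observed = normal_pdf(w, mu, sd)       # e(w_j, c_j) at observed exposure
    gps_target = normal_pdf(w_target, mu, sd)  # e(w_target, c_i) for every unit

    # Absolute GPS distance between each unit i and each candidate j.
    dist = np.abs(gps_target[:, None] - gps_observed[in_caliper][None, :])
    matched = in_caliper[dist.argmin(axis=1)]  # nearest candidate, with replacement
    return y[matched].mean()
```

Evaluating `matched_mean_outcome` over a grid of target exposure levels would trace out a rough exposure-response curve. In practice, one would also check covariate balance in the matched set before analysing outcomes, in keeping with the separation of design and analysis described above.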