Institute of Behavioral Science, University of Colorado Boulder, 1440 15th St., Boulder, CO 80309, USA.
Prev Sci. 2021 Nov;22(8):1159-1172. doi: 10.1007/s11121-021-01263-2. Epub 2021 Jun 26.
Randomized controlled trials (RCTs) are often considered the gold standard in evaluating whether intervention results are in line with causal claims of beneficial effects. However, given that poor design and incorrect analysis may lead to biased outcomes, simply employing an RCT is not enough to say an intervention "works." This paper applies a subset of the Society for Prevention Research (SPR) Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research, with a focus on internal validity (making causal inferences) to determine the degree to which RCTs of preventive interventions are well-designed and analyzed, and whether authors provide a clear description of the methods used to report their study findings. We conducted a descriptive analysis of 851 RCTs published from 2010 to 2020 and reviewed by the Blueprints for Healthy Youth Development web-based registry of scientifically proven and scalable interventions. We used Blueprints' evaluation criteria that correspond to a subset of SPR's standards of evidence. Only 22% of the sample satisfied important criteria for minimizing biases that threaten internal validity. Overall, we identified an average of 1-2 methodological weaknesses per RCT. The most frequent sources of bias were problems related to baseline non-equivalence (i.e., differences between conditions at randomization) or differential attrition (i.e., differences between completers versus attritors or differences between study conditions that may compromise the randomization). Additionally, over half the sample (51%) had missing or incomplete tests to rule out these potential sources of bias. Most preventive intervention RCTs need improvement in rigor to permit causal inference claims that an intervention is effective. Researchers also must improve reporting of methods and results to fully assess methodological quality. These advancements will increase the usefulness of preventive interventions by ensuring the credibility and usability of RCT findings.
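For readers unfamiliar with the two checks the review flags as most often missing or incomplete, the following is a minimal sketch (not drawn from the paper's analysis) of a baseline-equivalence test and a differential-attrition test. The data, variable names (condition, baseline_score, dropped_out), and test choices are illustrative assumptions only; Blueprints' actual evaluation criteria are more extensive.

```python
# Hypothetical illustration of baseline-equivalence and differential-attrition
# checks; simulated data, not the study's dataset or methods.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
n = 400

# Simulated two-arm trial: condition (1 = intervention, 0 = control),
# one pre-randomization covariate, and a follow-up dropout flag.
df = pd.DataFrame({
    "condition": rng.integers(0, 2, size=n),
    "baseline_score": rng.normal(50, 10, size=n),
    "dropped_out": rng.binomial(1, 0.15, size=n),
})

# 1. Baseline equivalence: do the conditions differ on a covariate at randomization?
treat = df.loc[df.condition == 1, "baseline_score"]
ctrl = df.loc[df.condition == 0, "baseline_score"]
t_stat, p_baseline = stats.ttest_ind(treat, ctrl)
print(f"Baseline equivalence: t = {t_stat:.2f}, p = {p_baseline:.3f}")

# 2a. Differential attrition: do dropout rates differ between conditions?
crosstab = pd.crosstab(df.condition, df.dropped_out)
chi2, p_attrition, dof, expected = stats.chi2_contingency(crosstab)
print(f"Attrition by condition: chi2 = {chi2:.2f}, p = {p_attrition:.3f}")

# 2b. Differential attrition: do completers and attritors differ at baseline?
completers = df.loc[df.dropped_out == 0, "baseline_score"]
attritors = df.loc[df.dropped_out == 1, "baseline_score"]
t_stat2, p_diff = stats.ttest_ind(completers, attritors)
print(f"Completers vs attritors: t = {t_stat2:.2f}, p = {p_diff:.3f}")
```

In practice, reporting these tests (or equivalent model-based adjustments) is what allows reviewers to rule out the baseline non-equivalence and differential attrition problems described in the abstract.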