Department of Surgery, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts.
J Surg Res. 2019 Sep;241:235-239. doi: 10.1016/j.jss.2019.03.062. Epub 2019 Apr 28.
Many articles in the surgical literature have been faulted for committing a type 2 error, that is, concluding that there is no difference when the study was "underpowered". However, it is unknown whether the current power standard of 0.8 is reasonable in surgical science.
PubMed was searched for abstracts published in Surgery, JAMA Surgery, and Annals of Surgery from January 1, 2012 to December 31, 2016. Abstracts indexed with the Medical Subject Heading terms randomized controlled trial (RCT) or observational study (OBS) and limited to humans were included (n = 403). Articles were excluded if all reported findings were statistically significant (n = 193) or if the presented data were insufficient to calculate power (n = 141).
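The abstract does not state how power was computed from the published data. As an illustration only, the following minimal sketch shows one common way to estimate the power of a nonsignificant comparison of two proportions, using the normal approximation for a two-sided z-test. The function name `two_proportion_power`, the choice of test, and the default alpha of 0.05 are assumptions made for this example and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Hypothetical helper for illustration; uses the normal approximation
    and treats the observed proportions as the true effect.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")

    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))

    # Standard error under the alternative (unpooled).
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    delta = abs(p1 - p2)

    # Probability that the test statistic falls in either rejection region.
    upper = 1 - z.cdf((z_crit * se_null - delta) / se_alt)
    lower = z.cdf((-z_crit * se_null - delta) / se_alt)
    return upper + lower


if __name__ == "__main__":
    # Made-up numbers, not data from the study: 10% vs 15% event rates,
    # 50 patients per arm.
    print(round(two_proportion_power(0.10, 0.15, 50, 50), 2))
```

With small arms and a modest difference in event rates, as in the made-up usage example, the estimated power falls well below 0.8. The same difference with much larger arms yields a higher value.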
A total of 69 manuscripts (59 RCTs and 10 OBSs) were assessed. Overall, the median power was 0.16 (interquartile range [IQR] 0.08-0.32). The median power was 0.16 for RCTs (IQR 0.08-0.32) and 0.14 for OBSs (IQR 0.09-0.22). Only 4 studies (5.8%) reached or exceeded the current 0.8 standard. Two-thirds of our study sample had an a priori power calculation (n = 41).
High-impact surgical science was routinely unable to reach the arbitrary power standard of 0.8. The academic surgical community should reconsider the power threshold as it applies to surgical investigations. We contend that the blueprint for the redesign should include benchmarking the power of articles on a gradient scale, instead of aiming for an unreasonable threshold.