Unrepresentative big surveys significantly overestimated US vaccine uptake.

Affiliations

Department of Statistics, University of Oxford, Oxford, UK.

Department of Political Science, Stanford University, Stanford, CA, USA.

Publication information

Nature. 2021 Dec;600(7890):695-700. doi: 10.1038/s41586-021-04198-4. Epub 2021 Dec 8.

Abstract

Surveys are a crucial tool for understanding public opinion and behaviour, and their accuracy depends on maintaining statistical representativeness of their target populations by minimizing biases from all sources. Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox. Here we demonstrate this paradox in estimates of first-dose COVID-19 vaccine uptake in US adults from 9 January to 19 May 2021 from two large surveys: Delphi-Facebook (about 250,000 responses per week) and Census Household Pulse (about 75,000 every two weeks). In May 2021, Delphi-Facebook overestimated uptake by 17 percentage points (14-20 percentage points with 5% benchmark imprecision) and Census Household Pulse by 14 (11-17 percentage points with 5% benchmark imprecision), compared to a retroactively updated benchmark the Centers for Disease Control and Prevention published on 26 May 2021. Moreover, their large sample sizes led to minuscule margins of error on the incorrect estimates. By contrast, an Axios-Ipsos online panel with about 1,000 responses per week following survey research best practices provided reliable estimates and uncertainty quantification. We decompose observed error using a recent analytic framework to explain the inaccuracy in the three surveys. We then analyse the implications for vaccine hesitancy and willingness. We show how a survey of 250,000 respondents can produce an estimate of the population mean that is no more accurate than an estimate from a simple random sample of size 10. Our central message is that data quality matters more than data quantity, and that compensating the former with the latter is a mathematically provable losing proposition.
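
The arithmetic behind the "simple random sample of size 10" claim can be sketched as follows. This is a minimal illustration, assuming the analytic framework in question is the data defect correlation (ddc) identity of Meng (2018), in which the error of a sample mean equals ddc × sqrt((N − n)/n) × population standard deviation. The population size, the ddc value, and the true uptake below are hypothetical numbers chosen for illustration, not estimates taken from the paper.

```python
import math


def ddc_error(rho, n, N, sigma):
    """Error of a sample mean under the ddc identity (assumed framework):
    error = rho * sqrt((N - n) / n) * sigma."""
    return rho * math.sqrt((N - n) / n) * sigma


def effective_sample_size(rho, n, N):
    """Approximate size of a simple random sample whose mean squared error
    matches that of a biased sample of size n: (f / (1 - f)) / rho**2,
    where f = n / N is the sampling fraction."""
    f = n / N
    return f / (1 - f) / rho**2


def srs_margin_of_error(n, sigma, z=1.96):
    """Conventional 95% margin of error, which treats the sample as if it
    were a simple random sample and so ignores selection bias."""
    return z * sigma / math.sqrt(n)


# Hypothetical inputs (assumptions, not values reported in the abstract).
N = 255_000_000              # rough size of the US adult population
n = 250_000                  # weekly responses, as for Delphi-Facebook
rho = 0.01                   # illustrative data defect correlation
p = 0.6                      # illustrative true first-dose uptake
sigma = math.sqrt(p * (1 - p))  # std dev of a binary outcome

print(f"nominal margin of error: {srs_margin_of_error(n, sigma):.4f}")
print(f"error implied by ddc:    {ddc_error(rho, n, N, sigma):.4f}")
print(f"effective sample size:   {effective_sample_size(rho, n, N):.1f}")
```

Under these assumed inputs, the nominal margin of error is a fraction of a percentage point, the implied error is on the order of 15 percentage points, and the effective sample size is about 10. A correlation of only 0.01 between responding and the outcome is enough to erase almost all of the apparent precision of 250,000 responses.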

