Marques-Toledo Cecilia de Almeida, Degener Carolin Marlen, Vinhal Livia, Coelho Giovanini, Meira Wagner, Codeço Claudia Torres, Teixeira Mauro Martins
Departamento de Bioquimica e Imunologia do Instituto de Ciencias Biologicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
Consultoria Tecnica, Ecovec LTDA, Belo Horizonte, Minas Gerais, Brazil.
PLoS Negl Trop Dis. 2017 Jul 18;11(7):e0005729. doi: 10.1371/journal.pntd.0005729. eCollection 2017 Jul.
BACKGROUND: Infectious diseases are a leading threat to public health. Accurate and timely monitoring of disease risk and progress can reduce their impact. Mentions of a disease on social networks correlate with physician visits by patients and can be used to estimate disease activity. Dengue is the fastest-growing mosquito-borne viral disease, with an estimated annual incidence of 390 million infections, of which 96 million manifest clinically. Dengue burden is likely to increase in the future owing to trends toward increased urbanization, scarce water supplies and, possibly, environmental change. The epidemiological dynamics of Dengue are complex and difficult to predict, partly because surveillance systems are costly and slow.
METHODOLOGY / PRINCIPAL FINDINGS: In this study, we aimed to quantitatively assess the usefulness of data acquired from Twitter for the early detection and monitoring of Dengue epidemics, at both country and city level, on a weekly basis. We evaluated and demonstrated the potential of tweet-based modeling for Dengue estimation and forecasting, in comparison with other available web-based data, namely Google Trends and Wikipedia access logs. We also studied the factors that might influence the goodness-of-fit of the model. We built a simple model based on tweets that was able to 'nowcast', i.e. estimate disease numbers in the same week, and also to 'forecast' disease in future weeks. At the country level, tweets are strongly associated with Dengue cases and can estimate present and future Dengue cases up to 8 weeks in advance. At the city level, tweets are also useful for estimating Dengue activity. Our model can be applied successfully to small and less developed cities, suggesting a robust construction, even though it may be influenced by the incidence of the disease, local Twitter activity, and social factors, including human development index and internet access.
CONCLUSIONS / SIGNIFICANCE: The association of tweets with Dengue cases is valuable for assisting traditional Dengue surveillance in real time and at low cost. Tweets are able to successfully nowcast, i.e. estimate Dengue in the present week, and also to forecast, i.e. predict Dengue up to 8 weeks in the future, at both country and city level, with high estimation capacity.
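The difference between nowcasting and forecasting described above can be illustrated with a minimal sketch. The abstract does not specify the form of the model, so this example assumes a one-predictor linear regression of weekly case counts on weekly tweet counts, with an optional lead (`horizon`) in weeks. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return a, b


def fit_horizon(tweets, cases, horizon):
    """Regress cases in week t + horizon on tweets in week t.

    horizon = 0 is a nowcast; horizon > 0 is a forecast.
    Assumption: a plain linear relation, chosen only for illustration.
    """
    if len(tweets) != len(cases):
        raise ValueError("series must have the same length")
    n = len(tweets) - horizon  # number of usable (tweet, case) pairs
    if horizon < 0 or n < 2:
        raise ValueError("horizon leaves too few weeks to fit")
    return fit_line(tweets[:n], cases[horizon:horizon + n])


def predict(model, tweet_count):
    """Apply a fitted (intercept, slope) pair to a new weekly tweet count."""
    a, b = model
    return a + b * tweet_count


# Toy weekly series, invented for this example (not data from the paper).
toy_tweets = [10, 12, 15, 20, 30, 45, 60, 50, 35, 20, 12, 10]
toy_cases = [30, 33, 41, 52, 70, 98, 131, 112, 80, 49, 31, 27]

nowcast_model = fit_horizon(toy_tweets, toy_cases, horizon=0)
forecast_model = fit_horizon(toy_tweets, toy_cases, horizon=2)

print("nowcast for a week with 40 tweets:", predict(nowcast_model, 40))
print("2-week-ahead forecast from 40 tweets:", predict(forecast_model, 40))
```

One model is fitted per horizon, so the same tweet series can feed both the same-week estimate and each of the weeks-ahead estimates. Comparing goodness-of-fit across horizons is one way to see how far ahead the tweet signal remains informative.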