Department of Industrial Engineering, Tel Aviv University, Tel Aviv 69978, Israel.
MIT Media Lab, Cambridge, MA 02139-4307, USA.
J R Soc Interface. 2021 Aug;18(181):20210284. doi: 10.1098/rsif.2021.0284. Epub 2021 Aug 4.
Current COVID-19 screening efforts rely mainly on reported symptoms and potential exposure to infected individuals. Here, we developed a machine-learning model for COVID-19 detection that uses four layers of information: (i) sociodemographic characteristics of the individual, (ii) spatio-temporal patterns of the disease, (iii) medical condition and general health consumption of the individual and (iv) information reported by the individual during the testing episode. We evaluated our model on 140 682 members of Maccabi Health Services who were tested for COVID-19 at least once between February and October 2020. In total, these individuals underwent 264 516 COVID-19 PCR tests, of which 16 512 were positive. Our multi-layer model obtained an area under the curve (AUC) of 81.6% when evaluated over all individuals in the dataset, and an AUC of 72.8% when only individuals who did not report any symptom were included. Furthermore, when considering only information collected before the testing episode (i.e. before the individual had the chance to report any symptom), our model still reached an AUC of 79.5%. The ability to predict the outcomes of COVID-19 tests early is pivotal for breaking transmission chains and can support a more efficient testing policy.
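For readers unfamiliar with the reported metric, the following is a minimal sketch of how an AUC can be computed from model scores and test outcomes, using its interpretation as the probability that a randomly chosen positive test receives a higher score than a randomly chosen negative one. The function name `auc` and the toy labels and scores are illustrative assumptions and are not taken from the study; only the test counts in the last lines come from the abstract.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data for illustration only (not from the study).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.4]
toy_auc = auc(labels, scores)

# Share of positive tests, using the counts reported in the abstract.
positivity = 16512 / 264516

print(f"toy AUC: {toy_auc:.3f}")
print(f"test positivity: {positivity:.2%}")
```

The pairwise loop is quadratic in the number of tests, so it is suitable only for small examples; on a dataset of the size described here, a rank-based implementation would be the practical choice.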