Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA.
Nat Methods. 2023 Apr;20(4):512-522. doi: 10.1038/s41592-023-01769-3. Epub 2023 Feb 23.
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.