1 National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB R3E 3R2, Canada.
2 University of Manitoba, Winnipeg, MB R3T 2N2, Canada.
Microb Genom. 2017 Jun 8;3(6):e000116. doi: 10.1099/mgen.0.000116. eCollection 2017 Jun 30.
The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research laboratory and into the front lines of surveillance and outbreak response requires user-friendly, reproducible and scalable pipelines that have been well validated. Single Nucleotide Variant Phylogenomics (SNVPhyl) is a bioinformatics pipeline for identifying high-quality single-nucleotide variants (SNVs) and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and can identify and remove regions of high SNV density (indicative of recombination). SNVPhyl correctly distinguishes outbreak from non-outbreak isolates across a range of variant-calling settings and sequencing-coverage thresholds, and in the presence of contamination. SNVPhyl is available as a Galaxy workflow, as Docker and virtual machine images, and as a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.
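The idea of removing regions of high SNV density can be illustrated with a minimal sketch. This is not SNVPhyl's implementation: the function names, the sliding-window rule, and the `window` and `threshold` defaults are all assumptions chosen for illustration. It flags any stretch of the reference in which at least `threshold` SNVs fall within `window` bp, then drops the SNVs inside those stretches.

```python
from bisect import bisect_right


def high_density_regions(positions, window=500, threshold=2):
    """Return merged (start, end) intervals containing at least
    `threshold` SNVs within `window` bp (hypothetical rule, not SNVPhyl's)."""
    pos = sorted(set(positions))
    regions = []
    for i, start in enumerate(pos):
        # Index one past the last SNV inside [start, start + window - 1].
        end_idx = bisect_right(pos, start + window - 1)
        if end_idx - i >= threshold:
            region = (start, pos[end_idx - 1])
            if regions and region[0] <= regions[-1][1]:
                # Overlaps the previous dense region: extend it.
                prev_start, prev_end = regions[-1]
                regions[-1] = (prev_start, max(prev_end, region[1]))
            else:
                regions.append(region)
    return regions


def filter_snvs(positions, regions):
    """Keep only SNV positions that fall outside every flagged region."""
    return [
        p for p in sorted(set(positions))
        if not any(start <= p <= end for start, end in regions)
    ]


# Illustrative SNV coordinates on a reference genome (made-up values).
snv_positions = [100, 5000, 5100, 5200, 9000]
dense = high_density_regions(snv_positions, window=500, threshold=3)
kept = filter_snvs(snv_positions, dense)
```

With these made-up coordinates, the three SNVs clustered between positions 5000 and 5200 are flagged as one dense region, and the two isolated SNVs are retained for phylogeny construction.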