Zidane Nora, Rodrigues Carla, Bouchez Valérie, Rethoret-Pasty Martin, Passet Virginie, Brisse Sylvain, Crestani Chiara
Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75015 Paris, France.
Institut Pasteur, National Reference Center for Whooping Cough and Other Bordetella Infections, 75015 Paris, France.
Genome Res. 2025 Aug 1;35(8):1758-1766. doi: 10.1101/gr.279829.124.
High-throughput massively parallel sequencing has significantly improved bacterial pathogen genomics, diagnostics, and epidemiology. Despite its high accuracy, short-read sequencing struggles with complete genome reconstruction and with the assembly of extrachromosomal elements such as plasmids. Long-read sequencing with Oxford Nanopore Technologies (ONT) presents an alternative that offers benefits including real-time sequencing and cost efficiency, which are particularly useful in resource-limited settings. However, the historically higher error rates of ONT data have so far limited its application in high-precision genomic typing. The recent release of ONT's R10.4.1 chemistry, with significantly improved raw read accuracy (Q20+), offers a potential solution to this problem. The aim of this study is to evaluate the performance of ONT's latest chemistry for bacterial genomic typing against the gold-standard Illumina technology, focusing on three respiratory pathogens of public health importance and their related species. Using the Rapid Barcoding Kit V14, we generate and analyze genome assemblies with different basecalling models and at different simulated depths of coverage. ONT assemblies are compared to the Illumina reference for completeness and for core genome multilocus sequence typing (cgMLST) accuracy, measured as the number of allelic mismatches. Our results show that accuracy is optimized for all bacterial species tested when raw ONT data are basecalled with Dorado SUP v0.9.0 and assembled with Flye at a minimum coverage depth of 35×. Error rates are consistently <0.5% for each cgMLST scheme, indicating that ONT R10.4.1 data are suitable for high-resolution genomic typing applied to outbreak investigations and public health surveillance.
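The cgMLST accuracy metric described above (allelic mismatches between an ONT assembly and its Illumina reference) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `cgmlst_mismatches`, the dictionary representation of an allelic profile, the locus names, and the choice to exclude loci that are uncalled in either profile are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: locus name -> allele ID, or None if not called.
Profile = Dict[str, Optional[str]]


def cgmlst_mismatches(reference: Profile, query: Profile) -> Tuple[int, int, float]:
    """Count allelic mismatches between two cgMLST profiles.

    Assumption: only loci called in BOTH profiles are compared; the source
    does not state how missing loci are treated.
    Returns (mismatches, loci compared, mismatch rate).
    """
    shared = [
        locus
        for locus, allele in reference.items()
        if allele is not None and query.get(locus) is not None
    ]
    mismatches = sum(1 for locus in shared if reference[locus] != query[locus])
    rate = mismatches / len(shared) if shared else 0.0
    return mismatches, len(shared), rate


# Toy profiles (made-up loci and alleles, not data from the study).
illumina = {"locus_1": "4", "locus_2": "17", "locus_3": "2", "locus_4": None}
ont = {"locus_1": "4", "locus_2": "18", "locus_3": "2", "locus_4": "9"}

n_mismatch, n_compared, rate = cgmlst_mismatches(illumina, ont)
print(f"{n_mismatch} mismatch(es) over {n_compared} loci ({rate:.1%})")
```

On a real scheme with hundreds or thousands of loci, a rate below 0.005 from such a comparison would correspond to the <0.5% threshold quoted in the abstract.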