Polishing the Oxford Nanopore long-read assemblies of bacterial pathogens with Illumina short reads to improve genomic analyses.

Affiliations

Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA.

Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA; Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA.

Publication information

Genomics. 2021 May;113(3):1366-1377. doi: 10.1016/j.ygeno.2021.03.018. Epub 2021 Mar 11.

Abstract

Oxford Nanopore sequencing has been widely used to achieve complete genomes of bacterial pathogens. However, the error rates of Oxford Nanopore long reads are high. Various polishing algorithms using Illumina short reads to correct the errors in Oxford Nanopore long-read assemblies have been developed. The impact of polishing the Oxford Nanopore long-read assemblies of bacterial pathogens with Illumina short reads on improving genomic analyses was evaluated using both simulated and real reads. Ten species (10 strains) were selected for simulated reads, while real reads were tested on 11 species (11 strains). Oxford Nanopore long reads were assembled with Unicycler to produce a draft assembly, followed by three rounds of polishing with Illumina short reads using two polishing tools, Pilon and NextPolish. One round of NextPolish polishing generated genome completeness and accuracy parameters similar to the reference genomes, whereas two or three rounds of Pilon polishing were needed, though contiguity remained unchanged after polishing. The polished assemblies of Escherichia coli O157:H7, Salmonella Typhimurium, and Cronobacter sakazakii with simulated reads did not provide accurate plasmid identifications. One round of NextPolish polishing was needed for accurately identifying plasmids in Staphylococcus aureus and E. coli O26:H11 with real reads, whereas one and two rounds of Pilon polishing were necessary for these two strains, respectively. Polishing failed to provide an accurate antimicrobial resistance (AMR) genotype for S. aureus with real reads. One round of polishing recovered an accurate AMR genotype for Klebsiella pneumoniae with real reads. The reference genome and draft assembly of Citrobacter braakii with real reads differed, which carried blaCMY-83 and fosA6, respectively, while both genes were present after one round of polishing. However, polishing did not improve the assembly of E. coli O26:H11 with real reads to achieve numbers of virulence genes similar to the reference genome. The draft and polished assemblies showed a phylogenetic tree topology comparable with the reference genomes. For multilocus sequence typing and pan-genome analyses, one round of NextPolish polishing was sufficient to obtain accurate results, while two or three rounds of Pilon polishing were needed. Overall, NextPolish outperformed Pilon for polishing the Oxford Nanopore long-read assemblies of bacterial pathogens, though both polishing strategies improved genomic analyses compared to the draft assemblies.
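The abstract describes repeated polishing rounds, where each round takes the previous round's output as its input, but it does not give the commands used. The sketch below only illustrates how such rounds might be chained for Pilon. It is not the authors' pipeline: the choice of `bwa mem` and `samtools` for short-read mapping, the file names, the `polish` output directory, and the helper name `pilon_round_commands` are all assumptions made for illustration, and a `pilon` wrapper script on the PATH is assumed. The function builds command strings and does not run anything.

```python
import shlex


def pilon_round_commands(draft, r1, r2, rounds=3, outdir="polish", threads=4):
    """Build shell commands for successive Pilon polishing rounds.

    Hypothetical helper: returns (commands, final_fasta) without executing.
    Each round maps the Illumina reads to the current assembly, then
    polishes it; the polished FASTA becomes the input of the next round.
    """
    q = shlex.quote
    commands = [f"mkdir -p {q(outdir)}"]
    current = draft
    for i in range(1, rounds + 1):
        prefix = f"round{i}"
        bam = f"{outdir}/{prefix}.sorted.bam"
        # Index the current assembly (assumed mapper: bwa).
        commands.append(f"bwa index {q(current)}")
        # Map paired-end short reads and sort the alignments.
        commands.append(
            f"bwa mem -t {threads} {q(current)} {q(r1)} {q(r2)} "
            f"| samtools sort -o {q(bam)} -"
        )
        commands.append(f"samtools index {q(bam)}")
        # Polish; Pilon writes <outdir>/<prefix>.fasta.
        commands.append(
            f"pilon --genome {q(current)} --frags {q(bam)} "
            f"--output {prefix} --outdir {q(outdir)}"
        )
        current = f"{outdir}/{prefix}.fasta"
    return commands, current


if __name__ == "__main__":
    cmds, final = pilon_round_commands("draft.fasta", "R1.fastq", "R2.fastq")
    print("\n".join(cmds))
    print("final assembly:", final)
```

Comparing the assembly after each round against a reference, as the study does, would show whether further rounds still change the result.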

