Liu Pu, Hu Shuofeng, He Zhen, Feng Chao, Dong Guohua, An Sijing, Liu Runyan, Xu Fang, Chen Yaowen, Ying Xiaomin
Center for Computational Biology, Beijing Institute of Basic Medical Sciences, Beijing, China.
Yongkang First People's Hospital, Yongkang, China.
Front Microbiol. 2022 May 5;13:828254. doi: 10.3389/fmicb.2022.828254. eCollection 2022.
Intestinal bacterial strains play crucial roles in maintaining host health, and researchers have increasingly recognized the importance of strain-level analysis in metagenomic studies. Many analysis tools and several cutting-edge sequencing techniques, such as single-cell sequencing, have been proposed to decipher strains in metagenomes. However, strain-level complexity remains far from well characterized to date. As an indicator of strain-level complexity, metagenomic single-nucleotide polymorphisms (SNPs) have been used to disentangle conspecific strains, and many SNP-based tools have been developed to identify strains in metagenomes. However, the sequencing depth sufficient for SNP and strain-level analysis remains unclear. We conducted ultra-deep sequencing of the human gut microbiome and constructed an unbiased framework to perform reliable SNP analysis, obtaining SNP profiles of the human gut metagenome. SNPs identified from conventional and ultra-deep sequencing data were thoroughly compared, and the relationship between SNP identification and sequencing depth was investigated. The results show that commonly used shallow-depth sequencing is incapable of supporting systematic metagenomic SNP discovery. In contrast, ultra-deep sequencing could detect more functionally important SNPs, which leads to reliable downstream analyses and novel discoveries. We also constructed a machine learning model to guide researchers in determining the optimal sequencing depth for their projects (SNPsnp, https://github.com/labomics/SNPsnp). To conclude, the SNP profiles based on ultra-deep sequencing data extend current knowledge of metagenomics and highlight the importance of evaluating sequencing depth before starting SNP analysis. This study provides new ideas and references for future strain-level investigations.
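As a rough illustration of why sequencing depth limits SNP discovery, the sketch below uses a simple binomial model: a variant present at a given allele frequency is counted as detectable only if at least a minimum number of reads covering the site carry it. This is not the SNPsnp model or the framework used in the study; the function name `detection_probability`, the 5% allele frequency, and the four-read threshold are all assumptions chosen for illustration.

```python
from math import comb


def detection_probability(depth: int, allele_freq: float, min_reads: int = 4) -> float:
    """Probability that at least `min_reads` of `depth` reads carry the variant.

    Assumes each read independently carries the variant with probability
    `allele_freq` (a binomial model; an illustrative simplification that
    ignores sequencing error and mapping bias).
    """
    if depth < min_reads:
        return 0.0
    # Probability of seeing fewer than `min_reads` supporting reads.
    p_miss = sum(
        comb(depth, i) * allele_freq**i * (1 - allele_freq) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_miss


if __name__ == "__main__":
    # Hypothetical per-site depths, from shallow to ultra-deep.
    for depth in (10, 30, 100, 300, 1000):
        prob = detection_probability(depth, allele_freq=0.05)
        print(f"depth={depth:>5}  P(detect)={prob:.3f}")
```

Under these assumptions, a low-frequency variant is almost never supported by enough reads at shallow per-site depth and is almost always supported at very high depth, which is consistent in spirit with the abstract's conclusion that shallow sequencing misses many SNPs.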