Kim Youngchul
Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
Methods Mol Biol. 2023;2629:183-229. doi: 10.1007/978-1-0716-2986-4_10.
Advances in next-generation sequencing (NGS) have made it possible to investigate uncultured microbiota and their genomes in an unbiased manner, and many microbiome studies have since reported strong evidence of close links between the microbiome and human health and disease. Bioinformatic and statistical analyses of NGS-based microbiome data are essential components of these studies, used to explore the complex composition of a microbial community and to understand the functions of its members in relation to host and environment. This chapter introduces bioinformatic analysis methods that generate taxonomic and functional feature count tables, along with a phylogenetic tree, from raw NGS microbiome data. It then introduces statistical methods and machine learning approaches for analyzing these outputs to infer the biodiversity of a microbial community and unravel host-microbiome associations. Understanding the advantages and limitations of these methods will help readers apply them correctly in microbiome data analysis and may open opportunities to develop new analytic techniques for microbiome research.