Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany.
German Center for Infection Research (DZIF), Partner Site Hannover-Braunschweig, Inhoffenstraße 7, 38124 Braunschweig, Germany.
Gigascience. 2020 Jan 1;9(1). doi: 10.1093/gigascience/giz154.
The number of microbial genome sequences is increasing exponentially, especially thanks to recent advances in recovering complete or near-complete genomes from metagenomes and single cells. Assigning reliable taxon labels to genomes is key and often a prerequisite for downstream analyses.
We introduce CAMITAX, a scalable and reproducible workflow for the taxonomic labelling of microbial genomes recovered from isolates, single cells, and metagenomes. CAMITAX combines genome distance-, 16S ribosomal RNA gene-, and gene homology-based taxonomic assignments with phylogenetic placement. It uses Nextflow to orchestrate reference databases and software containers and thus combines ease of installation and use with computational reproducibility. We evaluated the method on several hundred metagenome-assembled genomes with high-quality taxonomic annotations from the TARA Oceans project, and we show that the ensemble classification method in CAMITAX improved on all individual methods across tested ranks.
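To illustrate how taxonomic assignments from several methods might be combined into a single label, the following is a minimal sketch and not the rule CAMITAX itself applies, which the abstract does not specify. It assumes each method reports a lineage ordered from domain downwards, and it keeps the longest lineage prefix on which all informative methods agree (a conservative, lowest-common-ancestor-style consensus). The function name `consensus_lineage`, the method keys, and the example lineages are hypothetical.

```python
from typing import Dict, List, Sequence


def consensus_lineage(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest lineage prefix shared by all non-empty assignments.

    Assumption (not taken from the CAMITAX paper): each lineage is ordered
    from the highest rank (domain) to the lowest rank the method could
    resolve, and an empty lineage means the method made no assignment.
    """
    informative = [lineage for lineage in lineages if lineage]
    if not informative:
        return []

    consensus: List[str] = []
    # zip() stops at the shortest lineage, so the consensus never goes
    # deeper than the least specific informative method.
    for taxa_at_rank in zip(*informative):
        if len(set(taxa_at_rank)) != 1:
            break  # methods disagree at this rank; stop here
        consensus.append(taxa_at_rank[0])
    return consensus


# Hypothetical per-method assignments for one genome.
assignments: Dict[str, List[str]] = {
    "genome_distance": ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    "16S_rRNA": ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
    "gene_homology": ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
    "phylogenetic_placement": [],  # no assignment from this method
}

label = consensus_lineage(list(assignments.values()))
print(" > ".join(label))
```

In this example the methods agree down to phylum and disagree at class, so the sketch reports the phylum-level lineage. A real ensemble could weight methods differently or resolve conflicts less conservatively.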
While we initially developed CAMITAX to aid the Critical Assessment of Metagenome Interpretation (CAMI) initiative, it evolved into a comprehensive software package to reliably assign taxon labels to microbial genomes. CAMITAX is available under Apache License 2.0 at https://github.com/CAMI-challenge/CAMITAX.