An Integrated Pipeline for Annotation and Visualization of Metagenomic Contigs.

Authors

Dong Xiaoli, Strous Marc

Affiliation

Department of Geoscience, University of Calgary, Calgary, AB, Canada.

Publication information

Front Genet. 2019 Oct 15;10:999. doi: 10.3389/fgene.2019.00999. eCollection 2019.

Abstract

Here, we describe MetaErg, a standalone and fully automated metagenome and metaproteome annotation pipeline. Annotation of metagenomes is challenging. First, metagenomes contain sequence data of many organisms from all domains of life. Second, many of these are from understudied lineages, encoding genes with low similarity to experimentally validated reference genes. Third, assembly and binning are not perfect, sometimes resulting in artifactual hybrid contigs or genomes. To address these challenges, MetaErg provides graphical summaries of annotation outcomes, both for the complete metagenome and for individual metagenome-assembled genomes (MAGs). It performs a comprehensive annotation of each gene, including taxonomic classification, enabling functional inferences despite low similarity to reference genes, as well as detection of potential assembly or binning artifacts. When provided with metaproteome information, it visualizes gene and pathway activity using sequencing coverage and proteomic spectral counts, respectively. For visualization, MetaErg provides an HTML interface that brings all annotation results together and produces sortable and searchable tables, collapsible trees, and other graphic representations, enabling intuitive navigation of complex data. MetaErg, implemented in Perl, HTML, and JavaScript, is a fully open source application, distributed under the Academic Free License at https://github.com/xiaoli-dong/metaerg. MetaErg is also available as a Docker image at https://hub.docker.com/r/xiaolidong/docker-metaerg.
