Fellowship for Interpretation of Genomes, Burr Ridge, IL, United States of America.
University of Chicago, Chicago, IL, United States of America.
PLoS One. 2021 Apr 15;16(4):e0250092. doi: 10.1371/journal.pone.0250092. eCollection 2021.
Large volumes of metagenomically derived data are submitted to PATRIC for analysis, and we expect the share of PATRIC jobs that use metagenomic data to keep growing. One in-demand use case is the extraction of near-complete draft genomes from assembled contigs of metagenomic origin. We provide a new service for supervised extraction and annotation of high-quality, near-complete genomes from metagenomically derived contigs. The PATRIC metagenome binning service draws on the PATRIC database for a large, diverse set of reference genomes. Reference genomes are assigned to putative draft genome bins based on the presence of single-copy universal marker roles in the sample, and contigs are sorted into these bins by their similarity to reference genomes in PATRIC. Each set of binned contigs represents a draft genome that is then annotated by RASTtk in PATRIC. A structured-language binning report provides quality measurements and taxonomic information for the contig bins. The service emphasizes extraction of high-quality genomes for downstream analysis with other PATRIC tools and services. Because it is supervised, the binning service is not appropriate for mining novel or extremely low-coverage genomes from metagenomic samples.
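The two-stage procedure described in the abstract (seed bins from marker-role hits, then recruit contigs by similarity to the reference genomes) can be illustrated with a minimal sketch. This is not the PATRIC implementation: the function names, the use of shared k-mer counts as the similarity measure, and the `k` and `min_shared` parameters are all assumptions made for illustration only.

```python
# Illustrative sketch only; every name and threshold here is hypothetical
# and is not taken from the PATRIC binning service.

def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seed_bins(marker_hits):
    """Create one bin per reference genome implicated by a marker-role hit.

    marker_hits: iterable of (contig_id, reference_genome_id) pairs, assumed
    to come from an upstream search for single-copy universal marker roles.
    """
    bins = {}
    for contig_id, ref_id in marker_hits:
        bins.setdefault(ref_id, set()).add(contig_id)
    return bins


def assign_contigs(contigs, references, seeded, k=8, min_shared=5):
    """Place each unseeded contig in the bin whose reference shares the most k-mers.

    contigs:    dict of contig_id -> sequence
    references: dict of reference_genome_id -> sequence
    seeded:     dict of reference_genome_id -> set of contig ids (from seed_bins)
    Contigs sharing fewer than min_shared k-mers with every reference stay unbinned.
    """
    bins = {ref_id: set(members) for ref_id, members in seeded.items()}
    already_binned = set()
    for members in bins.values():
        already_binned |= members

    # Index k-mers only for references that actually seeded a bin.
    ref_kmers = {ref_id: kmers(references[ref_id], k) for ref_id in bins}

    for contig_id, seq in contigs.items():
        if contig_id in already_binned:
            continue
        contig_kmers = kmers(seq, k)
        best_ref, best_shared = None, 0
        for ref_id, rk in ref_kmers.items():
            shared = len(contig_kmers & rk)
            if shared > best_shared:
                best_ref, best_shared = ref_id, shared
        if best_ref is not None and best_shared >= min_shared:
            bins[best_ref].add(contig_id)
    return bins
```

Because a contig joins a bin only when it resembles a known reference, a sketch like this also shows why a supervised approach leaves contigs from novel organisms unbinned: they share too few k-mers with any reference to pass the threshold.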