The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Affiliations

Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA; Computation Institute, University of Chicago, Chicago, IL 60637, USA; Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Computer Science, San Diego State University, San Diego, CA 92182, USA; Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24060, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA; and Department of Computer Science, University of Chicago, Chicago, IL 60637, USA.

Publication information

Nucleic Acids Res. 2014 Jan;42(Database issue):D206-14. doi: 10.1093/nar/gkt1226. Epub 2013 Nov 29.

Abstract

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.
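The annotation cycle described in the abstract (annotate a submitted genome against a family collection, then fold public genomes back into that collection) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Family` class, the `annotate` and `submit_genome` functions, the shared 3-mer scoring and the `min_shared` threshold are hypothetical and are not the RAST pipeline, the FIGfam data model or the SEED API. Gene calling is omitted; the input is assumed to be already-called protein sequences.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


@dataclass
class Family:
    """Toy stand-in for a protein family carrying one function label."""
    function: str
    members: list[str] = field(default_factory=list)

    def signature(self) -> set[str]:
        """Union of the k-mers of all current member sequences."""
        sig: set[str] = set()
        for member in self.members:
            sig |= kmers(member)
        return sig


def annotate(protein: str, families: list[Family], min_shared: int = 3) -> Family | None:
    """Pick the family sharing the most k-mers with the protein.

    Shared k-mer counting is a simplification chosen for this sketch,
    not the comparison method described in the source.
    """
    query = kmers(protein)
    best, best_score = None, 0
    for fam in families:
        score = len(query & fam.signature())
        if score > best_score:
            best, best_score = fam, score
    return best if best_score >= min_shared else None


def submit_genome(proteins: list[str], families: list[Family],
                  make_public: bool) -> dict[str, str]:
    """Annotate each protein; if public, add matched proteins to their families."""
    annotations: dict[str, str] = {}
    matched: list[tuple[Family, str]] = []
    for protein in proteins:
        fam = annotate(protein, families)
        annotations[protein] = fam.function if fam else "hypothetical protein"
        if fam is not None:
            matched.append((fam, protein))
    # Feed the collection only after the whole genome is annotated, so that
    # results do not depend on the order of proteins within one submission.
    if make_public:
        for fam, protein in matched:
            fam.members.append(protein)
    return annotations
```

Calling `submit_genome` with `make_public=True` grows the member lists of the matched families, so a later submission is compared against a larger collection; with `make_public=False` the collection is left unchanged.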
