Tatusova Tatiana, Ciufo Stacy, Federhen Scott, Fedorov Boris, McVeigh Richard, O'Neill Kathleen, Tolstoy Igor, Zaslavsky Leonid
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
Nucleic Acids Res. 2015 Jan;43(Database issue):D599-605. doi: 10.1093/nar/gku1062. Epub 2014 Dec 15.
The NCBI RefSeq genome collection (http://www.ncbi.nlm.nih.gov/genome) represents all three major domains of life (Eukarya, Bacteria and Archaea) as well as Viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During 2014, more than 10,000 microbial genome assemblies were publicly released, bringing the total number of prokaryotic genomes close to 30,000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and to the results of pre-computed analyses, and by improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to the RefSeq prokaryotic genome data processing pipeline, including the calculation of genome groups (clades) and the optimization of protein cluster generation using a pan-genome approach.