ITEP: an integrated toolkit for exploration of microbial pan-genomes.

Affiliation

Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA.

Publication information

BMC Genomics. 2014 Jan 3;15:8. doi: 10.1186/1471-2164-15-8.

Abstract

BACKGROUND

Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes.

RESULTS

We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions. ITEP is designed for groups of related microbial species, among which the combination of core and variable genes constitutes their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignment and tree generation, annotation curation, and the integration of cross-genome analysis with metabolic networks for the study of metabolic network evolution.
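
The distinction between core and variable genes mentioned above can be illustrated with a minimal Python sketch. This is not ITEP's API: the function name classify_families, the genome labels, and the family identifiers are all hypothetical, and the example assumes protein families have already been computed and are given as a mapping from family to the set of genomes containing at least one member. A family present in every genome is treated as core; a family present in only some genomes is treated as variable.

```python
def classify_families(family_to_genomes, all_genomes):
    """Split protein families into core and variable lists.

    family_to_genomes: dict mapping a family ID to the set of genomes
        in which at least one member of that family was found.
    all_genomes: iterable of every genome in the comparison.
    (Both inputs are hypothetical; they are not ITEP data structures.)
    """
    all_genomes = set(all_genomes)
    core, variable = [], []
    for family, genomes in sorted(family_to_genomes.items()):
        present = set(genomes) & all_genomes
        if present == all_genomes:
            # Found in every genome: part of the core genome.
            core.append(family)
        elif present:
            # Found in some but not all genomes: variable.
            variable.append(family)
    return core, variable


# Toy presence/absence data for three hypothetical genomes.
genomes = ["genome_A", "genome_B", "genome_C"]
families = {
    "fam1": {"genome_A", "genome_B", "genome_C"},
    "fam2": {"genome_A", "genome_C"},
    "fam3": {"genome_B"},
}

core, variable = classify_families(families, genomes)
print("core:", core)
print("variable:", variable)
```

In this toy input, fam1 occurs in all three genomes and is classified as core, while fam2 and fam3 are variable. Together the two lists make up the pan-genome of the set.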

CONCLUSIONS

ITEP is a powerful, flexible toolkit for generation and curation of protein families. ITEP's modular design allows for straightforward extension as analysis methods and tools evolve. By integrating comparative genomics with the development of draft metabolic networks, ITEP harnesses the power of comparative genomics to build confidence in links between genotype and phenotype and helps disambiguate gene annotations when they are evaluated in both evolutionary and metabolic network contexts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b64/3890548/0629be6f506e/1471-2164-15-8-1.jpg
