StandEnA: a customizable workflow for standardized annotation and generating a presence-absence matrix of proteins.

Author information

Chafra Fatma, Borim Correa Felipe, Oni Faith, Konu Karakayalı Özlen, Stadler Peter F, Nunes da Rocha Ulisses

Affiliations

Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ, Leipzig 04318, Germany.

Department of Molecular Biology and Genetics, Bilkent University, Ankara 06800, Turkey.

Publication information

Bioinform Adv. 2023 Jun 9;3(1):vbad069. doi: 10.1093/bioadv/vbad069. eCollection 2023.

Abstract

MOTIVATION

Several genome annotation tools standardize annotation outputs for comparability. During standardization, these tools do not allow user-friendly customization of annotation databases, which limits their flexibility and applicability in downstream analysis.

RESULTS

StandEnA is a user-friendly command-line tool for Linux that facilitates the generation of custom databases by retrieving protein sequences from multiple databases. Directed by a user-defined list of standard names, StandEnA retrieves synonyms to search for corresponding sequences in a set of public databases. Custom databases are used in prokaryotic genome annotation to generate standardized presence-absence matrices and reference files containing standard database identifiers. To showcase StandEnA, we applied it to six metagenome-assembled genomes to analyze three different pathways.
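
As a rough illustration of the kind of output described above, the following Python sketch maps annotation labels to user-defined standard names through a synonym table and then builds a presence-absence matrix across genomes. It is not StandEnA's implementation: the function names, the synonym table, and the genome labels are hypothetical, and the actual tool retrieves synonyms and sequences from public databases rather than from an in-memory dictionary.

```python
def standardize(label, synonyms):
    """Return the standard name for an annotation label, or None if unknown.

    `synonyms` maps each standard name to a list of alternative names.
    Matching is case-insensitive and ignores surrounding whitespace.
    """
    key = label.strip().lower()
    for standard, names in synonyms.items():
        if key == standard.lower() or key in (n.lower() for n in names):
            return standard
    return None


def presence_absence_matrix(annotations, synonyms):
    """Build {genome: {standard_name: 0 or 1}} from per-genome labels.

    `annotations` maps a genome identifier to the labels assigned to its
    proteins by some annotation step (hypothetical input format).
    """
    matrix = {}
    for genome, labels in annotations.items():
        found = {standardize(label, synonyms) for label in labels}
        matrix[genome] = {name: int(name in found) for name in synonyms}
    return matrix


def to_tsv(matrix, standard_names):
    """Render the matrix as tab-separated text, one row per standard name."""
    genomes = sorted(matrix)
    lines = ["protein\t" + "\t".join(genomes)]
    for name in standard_names:
        row = [str(matrix[g][name]) for g in genomes]
        lines.append(name + "\t" + "\t".join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative data only; not taken from the paper.
    synonyms = {
        "nitrogenase iron protein": ["NifH", "dinitrogenase reductase"],
        "ammonia monooxygenase subunit A": ["AmoA"],
    }
    annotations = {
        "MAG_1": ["nifH", "hypothetical protein"],
        "MAG_2": ["AmoA", "Dinitrogenase reductase"],
    }
    matrix = presence_absence_matrix(annotations, synonyms)
    print(to_tsv(matrix, list(synonyms)))
```

Keeping the standard names as the row keys means that genomes annotated with different synonyms for the same protein still end up in the same row, which is the property that makes the resulting matrices comparable.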

AVAILABILITY AND IMPLEMENTATION

StandEnA is open-source software available at https://github.com/mdsufz/StandEnA.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e46/10336186/6e4f3e28a5b9/vbad069f1.jpg
