Stanton Richard A, Vlachos Nicholas, Halpin Alison Laufer
Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA.
U.S. Public Health Service, Rockville, MD 20852, USA.
Bioinformatics. 2022 Jan 3;38(2):546-548. doi: 10.1093/bioinformatics/btab607.
Tools that identify genes in microbial sequences using a reference database generally report matches as a percent identity. This value can be difficult to interpret when sequence identity is below 100%, because changes to specific amino acids, such as those in substrate binding regions or enzyme active sites, can have dramatic effects on protein function and, in turn, on phenotypes like antimicrobial resistance or virulence.
Here, we present GAMMA, an open-source tool for Gene Allele Mutation Microbial Assessment, which uses protein coding-level identity to make gene calls from any gene database and generates a classification (e.g. mutant, truncation) and translated annotation (e.g. Y190S mutation, truncation at residue 110) for these calls. GAMMA accurately called antimicrobial resistance genes from a large set of genomes faster than three other tools. It can also be used with any gene database, as we demonstrated by identifying virulence genes in the same genome set. Because of its speed and flexibility, GAMMA can be used to rapidly find and annotate any gene matches of interest in microbial sequencing data.
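To illustrate what a protein coding-level call looks like, the following is a minimal sketch and not GAMMA's actual implementation. It assumes the query and reference coding sequences are already aligned, in frame, and free of indels; the function names (`translate`, `classify`), the category labels, and the convention of reporting a truncation at the position of the premature stop codon are all hypothetical choices made for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        # Codons with ambiguous bases are translated as "X".
        aa = CODON_TABLE.get(dna[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def classify(ref_dna, query_dna):
    """Return a (classification, annotation) pair for an in-frame query.

    Hypothetical labels; assumes no insertions or deletions.
    """
    ref, query = translate(ref_dna), translate(query_dna)
    if len(query) < len(ref):
        # Assumed convention: report the residue replaced by the stop codon.
        return "truncation", f"truncation at residue {len(query) + 1}"
    changes = [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]
    if changes:
        return "mutant", ", ".join(changes)
    if ref_dna.upper() == query_dna.upper():
        return "exact", "identical nucleotide sequence"
    return "silent", "no amino acid changes"


reference = "ATGTATGGTAAATAA"  # encodes M Y G K followed by a stop codon
print(classify(reference, "ATGTCTGGTAAATAA"))  # one substituted codon
print(classify(reference, "ATGTATGGCAAATAA"))  # synonymous codon change
print(classify(reference, "ATGTATGGTTAATAA"))  # premature stop codon
```

In this sketch all three queries share 14 of 15 nucleotides with the reference, yet they receive three different calls, which is the distinction a single percent identity value does not convey.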
GAMMA is freely available as a Bioconda package (https://bioconda.github.io/recipes/gamma/README.html) and as a command line script (https://github.com/rastanton/GAMMA).
Supplementary data are available at Bioinformatics online.