Beaz-Hidalgo Roxana, Hossain Mohammad J, Liles Mark R, Figueras Maria-Jose
Unitat de Microbiologia, Departament de Ciències Mèdiques Bàsiques, Facultat de Medicina i Ciències de la Salut, IISPV, Universitat Rovira i Virgili, Reus, Spain.
Department of Biological Sciences, Auburn University, Auburn, Alabama, United States of America.
PLoS One. 2015 Jan 21;10(1):e0115813. doi: 10.1371/journal.pone.0115813. eCollection 2015.
Around 27,000 prokaryote genomes are presently deposited in the Genome database of GenBank at the National Center for Biotechnology Information (NCBI), and this number is growing exponentially. However, it is not known how many of these genomes correspond correctly to their designated taxon. The taxonomic affiliation of 44 Aeromonas genomes deposited at the NCBI (only five of which are from type strains) was determined by a multilocus phylogenetic analysis (MLPA) and by pairwise average nucleotide identity (ANI). Both the MLPA and ANI results showed discordant taxon assignments for 14 (35.9%) of the 39 non-type strain genomes. The data presented in this study also demonstrated that if the genome of the type strain is not available, a correctly identified genome of the same species can be used as a reference for ANI calculations. Of the three ANI calculation tools compared (ANI calculator, EzGenome and JSpecies), EzGenome and JSpecies provided very similar results. However, the ANI calculator provided higher intra- and inter-species values than the other two tools (differences within the ranges 0.06-0.82% and 0.92-3.38%, respectively). Nevertheless, each of these tools produced the same species classification for the studied Aeromonas genomes. To avoid possible misinterpretations with the ANI calculator, particularly when values are at the borderline of the 95% cutoff, it should be used in combination with one of the other calculation tools (EzGenome or JSpecies). It is recommended that, once a genome sequence is obtained, its taxonomic affiliation be verified using ANI or an MLPA before it is submitted to the NCBI, and that researchers amend the existing taxonomic errors present in databases.
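The decision rule implied by the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names, the choice to treat values at or above the cutoff as "same species", and the 1.0 percentage-point borderline margin are all assumptions introduced here. Only the 95% cutoff and the 14-of-39 count come from the abstract.

```python
from dataclasses import dataclass

SPECIES_CUTOFF = 95.0     # ANI species cutoff (%) cited in the abstract
BORDERLINE_MARGIN = 1.0   # hypothetical margin (percentage points), not from the source


@dataclass
class AniCall:
    same_species: bool  # True if ANI is at or above the cutoff (assumed convention)
    borderline: bool    # True if ANI lies within the margin of the cutoff


def classify_ani(ani_percent: float,
                 cutoff: float = SPECIES_CUTOFF,
                 margin: float = BORDERLINE_MARGIN) -> AniCall:
    """Compare one pairwise ANI value (in %) against the species cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return AniCall(
        same_species=ani_percent >= cutoff,
        borderline=abs(ani_percent - cutoff) <= margin,
    )


def discordance_rate(discordant: int, total: int) -> float:
    """Percentage of genomes whose label disagrees with the analysis, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * discordant / total, 1)


if __name__ == "__main__":
    # 14 discordant genomes out of 39 non-type strain genomes, as in the abstract.
    print(discordance_rate(14, 39))
    # A value close to the cutoff is flagged so that a second tool can be consulted.
    print(classify_ani(95.4))
```

A value flagged as `borderline` is the case in which the abstract advises confirming the result with EzGenome or JSpecies. The margin would need to be chosen with the reported between-tool differences in mind.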