Deloger Marc, El Karoui Meriem, Petit Marie-Agnès
INRA, UR888, F78350, Jouy en Josas, France.
J Bacteriol. 2009 Jan;191(1):91-9. doi: 10.1128/JB.01202-08. Epub 2008 Oct 31.
The fundamental unit of biological diversity is the species. However, genome sequencing has revealed a remarkable extent of intraspecies diversity in bacteria, which calls for clear criteria to group strains within a species. Two main types of analysis are used to quantify intraspecies variation at the genome level: the average nucleotide identity (ANI), which measures the DNA conservation of the core genome, and the DNA content, which is the proportion of DNA shared by two genomes. Both estimates rely on BLAST alignments to define the DNA sequences common to the genome pair. Interestingly, however, the results of these two methods on intraspecies pairs are not well correlated. This prompted us to develop a genomic-distance index that takes both criteria of diversity into account and is based on the DNA maximal unique matches (MUMs) shared by two genomes. The values, called MUMi (for MUM index), correlate better with the ANI than with the DNA content. Moreover, the MUMi groups strains in a way that is congruent with routinely used multilocus sequence-typing trees, as well as with ANI-based trees. We used the MUMi to determine the relatedness of all available genome pairs at the species and genus levels. Our analysis reveals a certain consistency in the current notion of bacterial species, in that the bulk of intraspecies and intragenus values are clearly separable. It also confirms that some species are much more diverse than most. As the MUMi is fast to calculate, it offers the possibility of measuring genome distances on the whole database of available genomes.
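The abstract does not state how the index is computed. As an illustration only, the sketch below assumes the definition MUMi = 1 - L_mum / L_av, where L_mum is the total length of the non-overlapping MUMs shared by the two genomes and L_av is the average of the two genome lengths. The function names `covered_length` and `mumi`, the `(start, length)` input format, and the way overlaps are handled (merging overlapping intervals on one genome) are assumptions made for this example; finding the MUMs themselves, for instance with a tool such as MUMmer, is outside its scope.

```python
from typing import Iterable, Tuple

# A MUM is described here by its start position on one genome and its length.
# This (start, length) layout is an assumption made for the example.
Mum = Tuple[int, int]


def covered_length(mums: Iterable[Mum]) -> int:
    """Number of positions covered by at least one MUM.

    Overlapping MUMs are merged so that no position is counted twice.
    This is a simplification chosen for the sketch.
    """
    total = 0
    cur_start = cur_end = None
    for start, length in sorted(mums):
        end = start + length
        if cur_end is None or start > cur_end:
            # Gap before this MUM: close the previous merged interval.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent MUM: extend the current interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def mumi(mums: Iterable[Mum], len_a: int, len_b: int) -> float:
    """Assumed index: 1 - L_mum / L_av (0 = identical, 1 = nothing shared)."""
    if len_a <= 0 or len_b <= 0:
        raise ValueError("genome lengths must be positive")
    l_av = (len_a + len_b) / 2
    return 1.0 - covered_length(mums) / l_av


if __name__ == "__main__":
    # Toy data: three MUMs, two of which overlap, on two 1000-bp genomes.
    toy_mums = [(0, 100), (50, 100), (300, 50)]
    print(mumi(toy_mums, 1000, 1000))
```

Under this assumed definition the value ranges from 0 for identical genomes to 1 for genomes sharing no MUM, so it acts as a distance that reflects both sequence conservation and shared DNA content.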