Lopez Philippe, Halary Sébastien, Bapteste Eric
Team 'Adaptation, Integration, Reticulation, Evolution' - UMR CNRS 7138 Evolution Paris Seine - Institut de Biologie Paris Seine - Université Pierre et Marie Curie, 7 quai St Bernard, 75005, Paris, France.
Département de Sciences Biologiques, Institut de recherche en biologie végétale, Université de Montréal, Montréal, QC, H1X 2B2, Canada.
Biol Direct. 2015 Oct 26;10:64. doi: 10.1186/s13062-015-0092-3.
Microbial genetic diversity is often investigated by comparing relatively similar 16S molecules: reference sequences and novel environmental sequences are aligned, then analysed using phylogenetic trees, direct BLAST matches, or phylotype counts. However, are we missing novel lineages in the microbial dark universe by relying on standard phylogenetic and BLAST methods? If so, how can we probe that universe using alternative approaches? We performed a novel type of multi-marker analysis of genetic diversity exploiting the topology of inclusive sequence similarity networks.
Our protocol identified 86 ancient gene families, well distributed and rarely transferred across the 3 domains of life, and retrieved their environmental homologs among 10 million predicted ORFs from human gut samples and other metagenomic projects. Numerous highly divergent environmental homologs were observed in gut samples, although the most divergent genes were over-represented in non-gut environments. In our networks, most divergent environmental genes grouped exclusively with uncultured relatives, in maximal cliques. Sequences within these groups were under strong purifying selection and presented a range of genetic variation comparable to that of a prokaryotic domain.
Many gene families included environmental homologs that were highly divergent from cultured homologs: in 79 gene families (including 18 ribosomal proteins), Bacteria and Archaea were less divergent from each other than some groups of environmental sequences were from any cultured or viral homolog. Moreover, some groups of environmental homologs branched very deeply in phylogenetic trees of life, when they were not too divergent to be aligned. These results underline how limited our understanding of the most diverse elements of the microbial world remains, and encourage a deeper exploration of natural communities and their genetic resources, hinting that major, still unknown divisions of life may remain to be discovered.
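The grouping of divergent environmental genes "exclusively with uncultured relatives, in maximal cliques" can be illustrated with a minimal sketch. It is not the authors' pipeline: the sequence names, the similarity scores, the threshold of 30, and the helper names (`build_network`, `maximal_cliques`, `environmental_only_cliques`) are all hypothetical, and the clique search is a plain Bron-Kerbosch enumeration chosen here only for illustration.

```python
def build_network(similarities, threshold):
    """Build adjacency sets from (seq_a, seq_b, score) tuples.

    An edge is kept when score >= threshold (hypothetical cutoff,
    e.g. percent identity from pairwise sequence comparisons).
    """
    adj = {}
    for a, b, score in similarities:
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b and score >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbours among candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def environmental_only_cliques(adj, environmental, min_size=3):
    """Keep maximal cliques made up solely of environmental sequences."""
    return [c for c in maximal_cliques(adj)
            if len(c) >= min_size and c <= environmental]


# Toy data: env* are environmental ORFs, ref* are cultured references.
similarities = [
    ("env1", "env2", 62), ("env1", "env3", 55), ("env2", "env3", 58),
    ("env3", "ref1", 35), ("ref1", "ref2", 90), ("env4", "ref2", 40),
]
network = build_network(similarities, threshold=30)
environmental = {"env1", "env2", "env3", "env4"}
cliques = environmental_only_cliques(network, environmental)
print([sorted(c) for c in cliques])
```

In this toy network, `env3` also matches a cultured reference, but the clique it forms with `env1` and `env2` contains no reference sequence, so that clique is reported as an environment-only group.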