Instituto de Biotecnología y Biología Molecular, CONICET, CCT-La Plata, Departamento de Ciencias Biológicas, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Argentina.
mBio. 2020 Jul 21;11(4):e00766-20. doi: 10.1128/mBio.00766-20.
Prokaryote genomes exhibit a wide range of GC contents and codon usages, both resulting from an interaction between mutational bias and natural selection. To investigate the basis underlying specific codon changes, we performed a comprehensive analysis of 29 different prokaryote families. The analysis of core gene sets with increasing ancestries in each family lineage revealed that codon usage became progressively more adapted to the tRNA pools. While, as previously reported, highly expressed genes presented the most optimized codon usage, the singletons contained the least selectively favored codons. The results showed that the codons with the highest translational adaptation were usually the ones preferentially enriched. In agreement with previous reports, a C bias in 2- to 3-fold pyrimidine-ending codons and a U bias in 4-fold codons occurred in all families, irrespective of the global genomic GC content. Furthermore, the U biases suggested that U-mRNA-U-tRNA interactions were responsible for a prominent codon optimization in both the most ancestral core and the highly expressed genes. A comparative analysis of sequences that encode conserved (c) or variable (v) translated products, each under high (HEP) and low (LEP) expression levels, demonstrated that efficiency was more relevant than accuracy (by a factor of 2) in shaping codon usage. Finally, analysis of the third position of codons (GC3) revealed that in genomes with global GC contents higher than 35 to 40%, selection favored a GC3 increase, whereas in genomes with very low GC contents, a decrease in GC3 occurred. A comprehensive final model is presented in which all patterns of codon usage variation are condensed into four distinct behavioral groups.
Prokaryotic genomes, the current heritage of the most ancient life forms on Earth, are composed of diverse gene sets, all characterized by varied origins, ancestries, and spatial-temporal expression patterns. Such genetic diversity has long raised the question of how cells shape their coding strategies to optimize protein demand (i.e., product abundance) and accuracy (i.e., translation fidelity) while using the same genetic code in genomes with GC contents that range from less than 20% to more than 80%. Here, we present evidence on how codon usage is adjusted across the prokaryotic tree of life and on how specific biases have operated to improve translation. Using proteome data, we characterized conserved and variable sequence domains in genes of either high or low expression level and quantified the relative weight of efficiency and accuracy, as well as their interaction, in shaping codon usage in prokaryotes.
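The abstract contrasts global GC content with GC content at the third codon position (GC3). As an illustration only, the two quantities could be computed for a single coding sequence roughly as follows. This is a minimal sketch, not the authors' pipeline: the function names `gc_content` and `gc3_content` and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_content(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence.

    Assumption: the sequence starts at codon position 1. Trailing bases that do
    not form a complete codon are ignored, as are ambiguous third-position bases.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon, if any
    third = [cds[i] for i in range(2, usable, 3) if cds[i] in "ACGT"]
    if not third:
        return 0.0
    return sum(b in "GC" for b in third) / len(third)


# Hypothetical toy sequence: codons ATG, GCC, AAA, TTT
example_cds = "ATGGCCAAATTT"
example_gc = gc_content(example_cds)    # 4 of 12 bases are G or C
example_gc3 = gc3_content(example_cds)  # third positions are G, C, A, T
```

Comparing the two values across the genes of a genome gives a rough view of whether third positions are richer or poorer in GC than the sequence as a whole, which is the kind of contrast the abstract describes at the genome scale.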