Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, United Kingdom of Great Britain and Northern Ireland.
Biochim Biophys Acta Bioenerg. 2022 Nov 1;1863(8):148597. doi: 10.1016/j.bbabio.2022.148597. Epub 2022 Jul 19.
The origin of the genetic code is an abiding mystery in biology. Hints of a 'code within the codons' suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO₂ and H₂. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO₂ fixation, with amino acids encoded by the purines (G followed by A) being closest to CO₂ fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO₂ and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.
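The link between the middle base and hydrophobicity can be illustrated with a minimal sketch. It is not the authors' analysis: the abstract combines multiple hydrophobicity measures, whereas this example assumes a single one, the Kyte-Doolittle hydropathy scale, as a stand-in. It also indexes by the middle base of the codon, which pairs with the second position of the anticodon. The names `CODON_TABLE`, `KYTE_DOOLITTLE` and `mean_hydropathy_by_second_base` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# (third position varies fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: Kyte-Doolittle hydropathy values (higher = more hydrophobic),
# used here as one example scale rather than the combined measure in the paper.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy_by_second_base(table=CODON_TABLE, scale=KYTE_DOOLITTLE):
    """Average hydropathy of encoded amino acids, grouped by the codon's middle base.

    Each sense codon counts once, so amino acids with more codons weigh more.
    Stop codons are skipped.
    """
    values = {base: [] for base in BASES}
    for codon, aa in table.items():
        if aa == "*":
            continue
        values[codon[1]].append(scale[aa])
    return {base: sum(v) / len(v) for base, v in values.items()}


if __name__ == "__main__":
    means = mean_hydropathy_by_second_base()
    # Print the four groups from most to least hydrophobic.
    for base, mean in sorted(means.items(), key=lambda item: -item[1]):
        print(f"middle base {base}: mean hydropathy {mean:+.2f}")
```

On this one scale, codons with U in the middle position encode the most hydrophobic group and codons with A the least, with C and G in between. The sketch does not separate early from late amino acids, so it says nothing about the weaker correlation reported for later additions to the code.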