Cote-L'Heureux Auden E, Sterner Elinor G, Maurer-Alcalá Xyrus X, Katz Laura A
Department of Biological Sciences, Smith College, Northampton, Massachusetts, USA.
Division of Invertebrate Zoology, Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, USA.
mBio. 2025 Apr 9;16(4):e0391624. doi: 10.1128/mbio.03916-24. Epub 2025 Mar 5.
Analyses of codon usage in eukaryotes suggest that amino acid usage responds to GC pressure, such that AT-biased substitutions drive higher usage of amino acids with AT-ending codons. Here, we combine single-cell transcriptomics and phylogenomics to explore codon usage patterns in foraminifera, a diverse and ancient clade of predominantly uncultivable microeukaryotes. We curate data from 1,044 gene families in 49 individuals representing 28 genera, generating perhaps the largest existing dataset from a predominantly uncultivable clade of protists, to analyze compositional bias and codon usage. We find extreme variation in composition, with a median GC content at fourfold degenerate silent sites below 3% in some species and above 75% in others. The most AT-biased species are distributed among diverse non-monophyletic lineages. Surprisingly, despite the extreme variation in compositional bias, amino acid usage is highly conserved across all foraminifera. By analyzing nucleotide, codon, and amino acid composition within this diverse clade of amoeboid eukaryotes, we expand our knowledge of patterns of genome evolution across the eukaryotic tree of life.
IMPORTANCE
Patterns of molecular evolution in protein-coding genes reflect trade-offs between substitution biases and selection on both codon and amino acid usage. Most analyses of these factors in microbial eukaryotes focus on model species such as and yeast, where substitution bias is a primary contributor to patterns of amino acid usage. Foraminifera, an ancient clade of single-celled eukaryotes, present a conundrum, as we find highly conserved amino acid usage underlain by divergent nucleotide composition, including extreme AT-bias at silent sites among multiple non-sister lineages. We speculate that these paradoxical patterns are enabled by the dynamic genome structure of foraminifera, whose life cycles can include genome endoreplication and chromatin extrusion.
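For readers unfamiliar with the statistic reported above, the following is a minimal illustrative sketch of how a median GC content at fourfold degenerate silent sites might be computed from in-frame coding sequences. It is not the authors' pipeline: the function names `gc4` and `median_gc4` are hypothetical, and the sketch assumes the standard genetic code, in which eight two-base codon prefixes tolerate any third base without changing the amino acid.

```python
from statistics import median
from typing import Iterable, Optional

# Assumption: standard genetic code. For these eight codon prefixes the third
# position is fourfold degenerate (Leu CTN, Val GTN, Ser TCN, Pro CCN,
# Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> Optional[float]:
    """Fraction of G or C at fourfold degenerate third positions of one
    in-frame coding sequence; None if the sequence has no such sites."""
    seq = cds.upper().replace("U", "T")
    gc_sites = 0
    total_sites = 0
    usable_length = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = seq[i:i + 3]
        third = codon[2]
        # Skip codons with ambiguous bases (e.g. N) at the third position.
        if codon[:2] in FOURFOLD_PREFIXES and third in "ACGT":
            total_sites += 1
            if third in "GC":
                gc_sites += 1
    return gc_sites / total_sites if total_sites else None


def median_gc4(sequences: Iterable[str]) -> Optional[float]:
    """Median per-gene GC4 across sequences, skipping genes with no
    fourfold degenerate sites; None if no gene is informative."""
    values = [v for v in (gc4(s) for s in sequences) if v is not None]
    return median(values) if values else None
```

Under these assumptions, calling `median_gc4` on the coding sequences of one individual would give a single value per individual, comparable in spirit to the below-3% and above-75% figures quoted in the abstract.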