Zenker Sanja, Wulf Donat, Meierhenrich Anja, Viehöver Prisca, Becker Sarah, Eisenhut Marion, Stracke Ralf, Weisshaar Bernd, Bräutigam Andrea
Computational Biology, Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany.
Center of Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany.
Plant Physiol. 2025 May 30;198(2). doi: 10.1093/plphys/kiaf205.
Transcription factors control gene expression during development and in response to a broad range of internal and external stimuli. They regulate promoter activity by directly binding cis-regulatory elements in DNA. The angiosperm Arabidopsis (Arabidopsis thaliana) contains more than 1,500 annotated transcription factors, each containing a DNA-binding domain that is used to define transcription factor families. Analyzing the binding motifs of 686 and the binding sites of 335 Arabidopsis transcription factors, as well as motifs of 92 transcription factors from other plants, we identified a constrained vocabulary of 74 conserved motifs spanning 50 families in plants. In 21 transcription factor families, we found 1 core motif shared by all analyzed members, with between 2% and 72% of binding sites overlapping. Five families show conservation of the motif along phylogenetic clades. Five families, including the C2H2 zinc finger family, show high diversity among motifs in plants, suggesting potential for the neofunctionalization of duplicated transcription factors based on the motif recognized. We tested whether the conserved motifs have remained conserved for at least 450 million years by determining the binding motifs of 17 transcription factors from 11 families in Marchantia (Marchantia polymorpha) using amplified DNA affinity purification sequencing. The binding motifs we detected were nearly identical to those predicted from the angiosperm data. Our findings show a large repertoire of overlapping binding sites within a transcription factor family and species and a high degree of binding motif conservation for at least 450 million years, indicating more potential for evolution in cis-regulatory than in trans-regulatory elements.