Laboratory of Viral Diseases, NIAID, National Institutes of Health, Bethesda, Maryland 20892-3210, USA.
J Biol Chem. 2012 Sep 7;287(37):31050-60. doi: 10.1074/jbc.M112.390054. Epub 2012 Jul 24.
Poxviruses are large DNA viruses that replicate within the cytoplasm and encode a complete transcription system, including a multisubunit RNA polymerase, stage-specific transcription factors, capping and methylating enzymes, and a poly(A) polymerase. Expression of the more than 200 open reading frames by vaccinia virus, the prototype poxvirus, is temporally regulated: early mRNAs are synthesized immediately after infection, whereas intermediate and late mRNAs are synthesized following genome replication. The postreplicative transcripts are heterogeneous in length and overlap the entire genome, which pose obstacles for high resolution mapping. We used tag-based methods in conjunction with high throughput cDNA sequencing to determine the precise 5'-capped and 3'-polyadenylated ends of postreplicative RNAs. Polymerase slippage during initiation of intermediate and late RNA synthesis results in a 5'-poly(A) leader that allowed the unambiguous identification of true transcription start sites. Ninety RNA start sites were located just upstream of intermediate and late open reading frames, but many more appeared anomalous, occurring within coding and non-coding regions, indicating pervasive transcription initiation. We confirmed the presence of functional promoter sequences upstream of representative anomalous start sites and demonstrated that alternative start sites within open reading frames could generate truncated isoforms of proteins. In an analogous manner, poly(A) sequences allowed accurate mapping of the numerous 3'-ends of postreplicative RNAs, which were preceded by a pyrimidine-rich sequence in the DNA coding strand. The distribution of postreplicative promoter sequences throughout the genome provides enormous transcriptional complexity, and the large number of previously unmapped RNAs may have novel functions.
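The use of the 5'-poly(A) leader to recognize true start sites can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy genome, and the exact-match lookup (standing in for a real read aligner) are all assumptions made for illustration. The idea is that a read from a slippage-initiated transcript begins with more A residues than the genome encodes immediately upstream of the aligned read body, so a positive excess flags a non-templated leader.

```python
from typing import Optional


def count_leading_a(read: str) -> int:
    """Count consecutive A residues at the 5' end of a cDNA read."""
    n = 0
    for base in read.upper():
        if base != "A":
            break
        n += 1
    return n


def leader_excess(read: str, genome: str) -> Optional[int]:
    """Return the number of 5' A residues in the read beyond those templated.

    Hypothetical helper: the read body (everything after the leading A run)
    is located in the genome by exact string match, a toy stand-in for a
    real aligner. Returns None if the body is empty or cannot be placed.
    """
    read = read.upper()
    genome = genome.upper()
    n_a = count_leading_a(read)
    body = read[n_a:]
    if not body:
        return None
    pos = genome.find(body)  # first exact match only; repeats are ignored
    if pos < 0:
        return None
    # Count A residues encoded in the genome directly upstream of the body.
    templated = 0
    i = pos - 1
    while i >= 0 and genome[i] == "A":
        templated += 1
        i -= 1
    return n_a - templated


# Toy sequence (assumed, not from the paper) containing a TAAATG motif.
toy_genome = "GCGTTTTTATAAATGGCTAGCTAGGCT"
slipped_read = "A" * 12 + "TGGCTAGCTAGG"   # 12 leading A's, 3 are templated
templated_read = "AAATGGCTAGCTAGG"         # only the 3 encoded A's

print(leader_excess(slipped_read, toy_genome))
print(leader_excess(templated_read, toy_genome))
```

A read with a positive excess would be counted as evidence for a start site at that position, while a read with zero excess is indistinguishable from a fragment of a longer read-through transcript. The threshold to apply in practice is left open here.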