National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Archaeal Virology Unit, Institut Pasteur, Université Paris Cité, Paris, France.
Virol J. 2024 Aug 26;21(1):200. doi: 10.1186/s12985-024-02482-z.
Viruses with double-stranded (ds) DNA genomes in the realm Duplodnaviria share a conserved structural gene module but vary widely in their repertoires of DNA replication proteins. Some duplodnaviruses encode (nearly) complete replication systems, whereas others lack (almost) all genes required for replication and rely on the host replication machinery. DNA polymerases (DNAPs) are the centerpiece of the DNA replication apparatus. The replicative DNAPs are classified into four unrelated or distantly related families (A-D), with the protein structures and sequences within each family being, generally, highly conserved. More than half of the duplodnaviruses encode a DNAP of family A, B or C. We showed previously that multiple pairs of closely related viruses in the order Crassvirales encode DNAPs of different families.
Groups of phages in which DNAP swapping likely occurred were identified as subtrees of a defined depth, containing phages with DNAPs of different families, in a comprehensive evolutionary tree of tailed bacteriophages. The DNAP swaps were validated by constrained tree analysis performed on a phylogenetic tree of the large terminase subunits, and the phage genomes encoding swapped DNAPs were aligned using Mauve. The structures of the unusual DNAPs discovered were predicted using AlphaFold2.
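The subtree screen described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Node` class, the function names, the toy tree, and the definition of subtree depth as the largest summed branch length from the subtree root to any of its leaves are all assumptions made for illustration. The sketch reports the largest clades that stay within a depth cutoff and whose leaves carry DNAPs of at least two families.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical tree node; leaves carry a DNAP family label."""
    name: str
    length: float = 0.0            # branch length to the parent node
    family: Optional[str] = None   # DNAP family ("A", "B", "C"), leaves only
    children: List["Node"] = field(default_factory=list)


def summarize(node: Node) -> Tuple[float, Set[str], List[str]]:
    """Return (depth, DNAP families, leaf names) for the clade rooted at node."""
    if not node.children:
        families = {node.family} if node.family else set()
        return 0.0, families, [node.name]
    depth, families, leaves = 0.0, set(), []
    for child in node.children:
        d, f, names = summarize(child)
        depth = max(depth, d + child.length)
        families |= f
        leaves += names
    return depth, families, leaves


def mixed_family_clades(root: Node, max_depth: float):
    """Largest clades no deeper than max_depth that contain >= 2 DNAP families."""
    depth, families, leaves = summarize(root)
    if depth <= max_depth:
        # A shallow clade with one family cannot contain a mixed subclade.
        return [(root.name, sorted(families), leaves)] if len(families) >= 2 else []
    hits = []
    for child in root.children:
        hits += mixed_family_clades(child, max_depth)
    return hits


# Toy tree (invented data): cladeX groups two close relatives with different DNAPs.
tree = Node("root", children=[
    Node("cladeX", 1.0, children=[Node("p1", 0.05, "A"), Node("p2", 0.04, "B")]),
    Node("cladeY", 1.2, children=[Node("p3", 0.03, "A"), Node("p4", 0.02, "A")]),
    Node("p5", 0.9, "C"),
])
hits = mixed_family_clades(tree, max_depth=0.1)
print(hits)
```

With the cutoff of 0.1, only `cladeX` is reported, because `cladeY` is shallow but carries a single DNAP family. Candidates flagged this way would still need independent validation, such as the constrained tree analysis and genome alignment named in the text.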
We identified four additional groups of tailed phages in the class Caudoviricetes in which the DNAPs apparently were swapped on multiple occasions, with replacements occurring between families A and B, between families A and C, or between distinct subfamilies within the same family. The DNAP swapping always occurs "in situ", without changes in the organization of the surrounding genes. In several cases, the DNAP gene is the only region of substantial divergence between closely related phage genomes, whereas in others, the swap apparently also involved neighboring genes encoding other proteins involved in phage genome replication. In addition, we identified two previously undetected, highly divergent groups of family A DNAPs that are encoded in some phage genomes along with the main DNAP implicated in genome replication.
Replacement of the DNAP gene by a gene encoding a DNAP of a different family occurred on many independent occasions during the evolution of different families of tailed phages, in some cases resulting in very closely related phages encoding unrelated DNAPs. DNAP swapping was likely driven by selection for evasion of as yet unidentified host antiphage mechanisms that target the phage DNAP, and/or by selection against replicon incompatibility.