RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany.
European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743 Jena, Germany.
Genome Res. 2019 Sep;29(9):1545-1554. doi: 10.1101/gr.247064.118. Epub 2019 Aug 22.
Sequence analyses of RNA virus genomes remain challenging owing to the exceptional genetic plasticity of these viruses. Because of high mutation and recombination rates, genome replication by viral RNA-dependent RNA polymerases leads to populations of closely related viruses, so-called "quasispecies." Standard (short-read) sequencing technologies are ill-suited to reconstruct large numbers of full-length haplotypes of (1) RNA virus genomes and (2) subgenome-length (sg) RNAs composed of noncontiguous genome regions. Here, we used a full-length, direct RNA sequencing (DRS) approach based on nanopores to characterize viral RNAs produced in cells infected with a human coronavirus. By using DRS, we were able to map the longest (∼26-kb) contiguous read to the viral reference genome. By combining Illumina and Oxford Nanopore sequencing, we reconstructed a highly accurate consensus sequence of the human coronavirus (HCoV)-229E genome (27.3 kb). Furthermore, by using long reads that did not require an assembly step, we were able to identify, in infected cells, diverse and novel HCoV-229E sg RNAs that remain to be characterized. Also, the DRS approach, which circumvents reverse transcription and amplification of RNA, allowed us to detect methylation sites in viral RNAs. Our work paves the way for haplotype-based analyses of viral quasispecies by showing the feasibility of intra-sample haplotype separation. Even though several technical challenges remain to be addressed to exploit the potential of the nanopore technology fully, our work illustrates that DRS may significantly advance genomic studies of complex virus populations, including predictions on long-range interactions in individual full-length viral RNA haplotypes.