

Variant evolution graph: Can we infer how SARS-CoV-2 variants are evolving?

Author information

Das Badhan, Heath Lenwood S

Affiliation

Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America.

Publication information

PLoS One. 2025 Jun 9;20(6):e0323970. doi: 10.1371/journal.pone.0323970. eCollection 2025.

Abstract

The SARS-CoV-2 virus has undergone extensive mutations over time, resulting in considerable genetic diversity among circulating strains. This diversity directly affects important viral characteristics, such as transmissibility and disease severity. During a viral outbreak, the rapid mutation rate produces a large cloud of variants, referred to as a viral quasispecies. However, many variants are lost due to the bottleneck of transmission and survival. Advances in next-generation sequencing have enabled continuous and cost-effective monitoring of viral genomes, but constructing reliable phylogenetic trees from the vast collection of sequences in GISAID (the Global Initiative on Sharing All Influenza Data) presents significant challenges. We introduce a novel graph-based framework inspired by quasispecies theory, the Variant Evolution Graph (VEG), to model viral evolution. Unlike traditional phylogenetic trees, VEG accommodates multiple ancestors for each variant and maps all possible evolutionary pathways. The strongly connected subgraphs in the VEG reveal critical evolutionary patterns, including recombination events, mutation hotspots, and intra-host viral evolution, providing deeper insights into viral adaptation and spread. We also derive the Disease Transmission Network (DTN) from the VEG, which supports the inference of transmission pathways and super-spreaders among hosts. We have applied our method to genomic data sets from five arbitrarily selected countries: Somalia, Bhutan, Hungary, Iran, and Nepal. Our study compares three methods for computing mutational distances to build the VEG (sourmash, pyani, and edit distance) with the phylogenetic approach using Maximum Likelihood (ML). Of the four, ML is the most computationally intensive and the slowest, because it requires multiple sequence alignment and probabilistic inference. In contrast, sourmash is the fastest, followed by the edit distance approach, while pyani takes more time due to its BLAST-based computations. This comparison highlights the computational efficiency of VEG and supports its use as a scalable alternative for analyzing large viral data sets.
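
To make the graph idea concrete, the following is a minimal sketch, not the authors' implementation. It computes pairwise edit distances between toy sequences, builds a directed graph in which a variant may have several ancestors, and extracts strongly connected components. The edge rule (connect two variants when their distance is at most `max_dist`, directed from the earlier collection day to the later one, and in both directions when the days are equal) is an assumption made for illustration; the abstract does not specify how VEG edges are defined. The sample names, collection days, and sequences are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def build_graph(samples, max_dist):
    """samples maps name -> (collection_day, sequence).

    Assumed edge rule (not from the paper): u -> v when the edit distance is
    at most max_dist and u was collected no later than v.
    """
    graph = {name: set() for name in samples}
    for u, v in combinations(samples, 2):
        (day_u, seq_u), (day_v, seq_v) = samples[u], samples[v]
        if edit_distance(seq_u, seq_v) <= max_dist:
            if day_u <= day_v:
                graph[u].add(v)
            if day_v <= day_u:
                graph[v].add(u)
    return graph


def strongly_connected_components(graph):
    """Kosaraju's algorithm with iterative traversals; returns a list of sets."""
    order, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)  # all neighbours explored: record finish
                stack.pop()

    # Second pass runs on the reversed graph, in decreasing finish order.
    reverse = {name: set() for name in graph}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].add(u)

    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


# Toy input: hypothetical sample names, collection days, and sequences.
samples = {
    "A": (1, "ACGTACGT"),
    "B": (2, "ACGTACGA"),
    "C": (2, "ACGTACGC"),
    "D": (3, "ACGAACGC"),
    "E": (5, "TTTTTTTT"),
}
graph = build_graph(samples, max_dist=2)
components = strongly_connected_components(graph)
ancestors_of_d = {u for u, targets in graph.items() if "D" in targets}
```

In this toy input, `D` ends up with three candidate ancestors (`A`, `B`, and `C`), which a tree could not represent, and `B` and `C` form a two-node strongly connected component because they share a collection day and differ by a single base. For real genomes the quadratic-time edit distance would be the bottleneck, which is consistent with the abstract's interest in faster distance estimates such as sourmash.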


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1cdd/12148141/872bef67cc2f/pone.0323970.g001.jpg
