Bayesian Inference of Pathogen Phylogeography using the Structured Coalescent Model.

Authors

Roberts Ian, Everitt Richard G, Koskela Jere, Didelot Xavier

Affiliations

Department of Statistics, University of Warwick, Coventry, United Kingdom.

Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research (SBIDER), University of Warwick, Coventry, United Kingdom.

Publication information

PLoS Comput Biol. 2025 Apr 21;21(4):e1012995. doi: 10.1371/journal.pcbi.1012995. eCollection 2025 Apr.

Abstract

Over the past decade, pathogen genome sequencing has become well established as a powerful approach to study infectious disease epidemiology. In particular, when multiple genomes are available from several geographical locations, comparing them is informative about the relative size of the local pathogen populations as well as past migration rates and events between locations. The structured coalescent model has a long history of being used as the underlying process for such phylogeographic analysis. However, the computational cost of using this model does not scale well to the large number of genomes frequently analysed in pathogen genomic epidemiology studies. Several approximations of the structured coalescent model have been proposed, but their effects are difficult to predict. Here we show how the exact structured coalescent model can be used to analyse a precomputed dated phylogeny, in order to perform Bayesian inference on the past migration history, the effective population sizes in each location, and the directed migration rates from any location to another. We describe an efficient reversible jump Markov chain Monte Carlo scheme, which is implemented in a new R package, StructCoalescent. We use simulations to demonstrate the scalability and correctness of our method and to compare it with existing software. We also apply our new method to several state-of-the-art datasets on the population structure of real pathogens to showcase the relevance of our method to current data scales and research questions.
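
For readers unfamiliar with the structured coalescent, the following is a minimal illustrative sketch in Python of the generative process the abstract refers to. It is not the StructCoalescent package or its API, and it does not perform inference. It assumes a common parameterisation that the abstract does not specify: lineages are all sampled at time 0, each pair of lineages in location i coalesces at rate 1/pop_sizes[i], and each lineage in location i moves (backwards in time) to location j at rate back_mig[i][j]. The function name and all parameter names are hypothetical.

```python
import random


def simulate_structured_coalescent(sample_counts, pop_sizes, back_mig, seed=None):
    """Simulate structured coalescent events backwards in time.

    Assumed (hypothetical) parameterisation:
      sample_counts[i]: number of lineages sampled in location i at time 0
      pop_sizes[i]:     effective population size of location i
      back_mig[i][j]:   backward-in-time migration rate from i to j

    Returns a list of (time, kind, source, destination) tuples, ending
    when a single lineage remains.
    """
    rng = random.Random(seed)
    k = list(sample_counts)  # current number of lineages per location
    n = len(k)
    t = 0.0
    events = []
    while sum(k) > 1:
        # Enumerate every possible next event with its rate.
        rates = []
        for i in range(n):
            coal = k[i] * (k[i] - 1) / 2 / pop_sizes[i]
            if coal > 0:
                rates.append((coal, ("coalescence", i, i)))
            for j in range(n):
                if j != i and k[i] > 0 and back_mig[i][j] > 0:
                    rates.append((k[i] * back_mig[i][j], ("migration", i, j)))
        total = sum(r for r, _ in rates)
        if total == 0:
            raise ValueError("lineages are isolated and can never coalesce")
        # Gillespie step: exponential waiting time, then pick an event
        # with probability proportional to its rate.
        t += rng.expovariate(total)
        u = rng.random() * total
        acc = 0.0
        for r, ev in rates:
            acc += r
            if u < acc:
                break
        kind, i, j = ev
        if kind == "coalescence":
            k[i] -= 1  # two lineages in location i merge into one
        else:
            k[i] -= 1  # one lineage moves from i to j
            k[j] += 1
        events.append((t, kind, i, j))
    return events


if __name__ == "__main__":
    history = simulate_structured_coalescent(
        sample_counts=[3, 2],
        pop_sizes=[1.0, 2.0],
        back_mig=[[0.0, 0.5], [0.3, 0.0]],
        seed=1,
    )
    for event in history:
        print(event)
```

With five sampled lineages, any simulated history contains exactly four coalescence events and a random number of migration events. Inference as described in the abstract works in the opposite direction: the dated phylogeny fixes the coalescence times, and the migration history, population sizes and migration rates are the unknowns.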

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5237/12040344/99196b5fe3d6/pcbi.1012995.g001.jpg
