Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations.

Author information

Corander Jukka, Marttinen Pekka, Sirén Jukka, Tang Jing

Affiliation

Department of Mathematics, Fänriksgatan 3B, Åbo Akademi University, Åbo, Finland.

Publication information

BMC Bioinformatics. 2008 Dec 16;9:539. doi: 10.1186/1471-2105-9-539.

DOI:10.1186/1471-2105-9-539

PMID:19087322

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2629778/

Abstract

BACKGROUND

During the most recent decade many Bayesian statistical models and software for answering questions related to the genetic structure underlying population samples have appeared in the scientific literature. Most of these methods utilize molecular markers for the inferences, while some are also capable of handling DNA sequence data. In a number of earlier works, we have introduced an array of statistical methods for population genetic inference that are implemented in the software BAPS. However, the complexity of biological problems related to genetic structure analysis keeps increasing such that in many cases the current methods may provide either inappropriate or insufficient solutions.

RESULTS

We discuss the necessity of enhancing the statistical approaches to face the challenges posed by the ever-increasing amounts of molecular data generated by scientists over a wide range of research areas and introduce an array of new statistical tools implemented in the most recent version of BAPS. With these methods it is possible, e.g., to fit genetic mixture models using user-specified numbers of clusters and to estimate levels of admixture under a genetic linkage model. Also, alleles representing a different ancestry compared to the average observed genomic positions can be tracked for the sampled individuals, and a priori specified hypotheses about genetic population structure can be directly compared using Bayes' theorem. In general, we have further improved the computational characteristics of the algorithms behind the methods implemented in BAPS, which facilitates the analysis of large and complex datasets. In particular, analysis of a single dataset can now be spread over multiple computers using a script interface to the software.
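As an illustration of comparing a priori specified hypotheses with Bayes' theorem, the following minimal sketch converts log marginal likelihoods into posterior probabilities. It is not the BAPS implementation or its interface: the function name `posterior_probabilities`, the uniform default prior, and the numeric log marginal likelihoods are all hypothetical and chosen only for the example.

```python
import math

def posterior_probabilities(log_marginal_likelihoods, priors=None):
    """Posterior probability of each hypothesis via Bayes' theorem.

    log_marginal_likelihoods: log P(data | H_i) for each hypothesis.
    priors: strictly positive P(H_i); defaults to a uniform prior (an assumption).
    """
    n = len(log_marginal_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    # Work in log space: log P(data | H_i) + log P(H_i)
    log_joint = [lml + math.log(p)
                 for lml, p in zip(log_marginal_likelihoods, priors)]
    # Subtract the maximum before exponentiating to avoid numerical underflow
    m = max(log_joint)
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for three candidate population structures
log_ml = [-10234.7, -10230.2, -10241.9]
posteriors = posterior_probabilities(log_ml)
for i, p in enumerate(posteriors):
    print(f"H{i + 1}: {p:.4f}")
```

With equal priors, the ratio of two posterior probabilities equals the Bayes factor between the corresponding hypotheses, so a difference of a few log units already concentrates most of the posterior mass on one structure.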

CONCLUSION

The Bayesian modelling methods introduced in this article represent an array of enhanced tools for learning the genetic structure of populations. Their implementations in the BAPS software are designed to meet the increasing need for analyzing large-scale population genetics data. The software is freely downloadable for Windows, Linux and Mac OS X systems at http://web.abo.fi/fak/mnf//mate/jc/software/baps.html.
