PLAST: parallel local alignment search tool for database comparison.

Affiliation

INRIA/IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France.

Publication information

BMC Bioinformatics. 2009 Oct 12;10:329. doi: 10.1186/1471-2105-10-329.

DOI:10.1186/1471-2105-10-329

PMID:19821978

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2770072/

Abstract

BACKGROUND

Sequence similarity searching is an important and challenging task in molecular biology and next-generation sequencing should further strengthen the need for faster algorithms to process such vast amounts of data. At the same time, the internal architecture of current microprocessors is tending towards more parallelism, leading to the use of chips with two, four and more cores integrated on the same die. The main purpose of this work was to design an effective algorithm to fit with the parallel capabilities of modern microprocessors.

RESULTS

A parallel algorithm for comparing large genomic banks and targeting middle-range computers has been developed and implemented in PLAST software. The algorithm exploits two key parallel features of existing and future microprocessors: the SIMD programming model (SSE instruction set) and the multithreading concept (multicore). Compared to multithreaded BLAST software, tests performed on an 8-processor server have shown speedup ranging from 3 to 6 with a similar level of accuracy.
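The two levels of parallelism named above can be pictured with a small sketch. This is not PLAST's implementation: the toy match/mismatch scores, the function names `score_batch` and `score_all`, and the `lanes` and `workers` parameters are all assumptions made for illustration. Pure Python loops stand in for SIMD lanes, and CPython threads do not run this work truly in parallel, so the code only shows how the work is organised: many ungapped windows are scored in lockstep within a batch, and batches are handed to a pool of threads.

```python
from concurrent.futures import ThreadPoolExecutor

MATCH, MISMATCH = 1, -1  # toy scores (assumption), not a real substitution matrix


def score_batch(pairs):
    """Score a batch of equal-length ungapped windows in lockstep.

    The outer loop walks window positions and the inner loop touches every
    pair in the batch, mimicking how one SIMD instruction updates several
    lanes at once. Returns the best running score seen for each pair.
    """
    running = [0] * len(pairs)
    best = [0] * len(pairs)
    width = len(pairs[0][0]) if pairs else 0
    for pos in range(width):
        for lane, (a, b) in enumerate(pairs):
            delta = MATCH if a[pos] == b[pos] else MISMATCH
            running[lane] = max(0, running[lane] + delta)  # never drop below zero
            best[lane] = max(best[lane], running[lane])
    return best


def score_all(pairs, lanes=4, workers=2):
    """Split pairs into fixed-size batches and score the batches on a thread pool."""
    batches = [pairs[i:i + lanes] for i in range(0, len(pairs), lanes)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(score_batch, batches))  # map preserves batch order
    return [score for batch in results for score in batch]
```

Calling `score_all(pairs, lanes=2)` on a list of `(query_window, subject_window)` string pairs returns one score per pair in input order. In a compiled implementation the inner loop over `lane` would typically become a single vector instruction and the thread pool would map onto physical cores.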

CONCLUSION

A parallel algorithmic approach driven by the knowledge of the internal microprocessor architecture allows significant speedup to be obtained while preserving standard sensitivity for similarity search problems.
