hakuna AG, Zürich, Switzerland.
Department of Biology, University of Turku, Turku, Finland.
PLoS One. 2023 Nov 30;18(11):e0289693. doi: 10.1371/journal.pone.0289693. eCollection 2023.
The Basic Local Alignment Search Tool (BLAST) is a versatile and commonly used sequence analysis tool in bioinformatics. BLAST permits fast and flexible sequence similarity searches across nucleotide and amino acid sequences, leading to diverse applications such as protein domain identification, orthology searches, and phylogenetic annotation. Most BLAST implementations are command-line tools that write their output to comma-separated values files. However, a portable, modular, and embeddable implementation of a BLAST-like algorithm is still missing from our toolbox. Here we present nsearch, a command-line tool and C++11 library that provides BLAST-like functionality and can easily be embedded in any application. As an example of this portability, we present Blaster, which leverages nsearch to provide native BLAST-like functionality for the R programming language, and npysearch, which provides similar functionality for Python. These packages permit embedding BLAST-like functionality into larger frameworks such as Shiny or Django applications. Benchmarks show that nsearch, npysearch, and Blaster are comparable in speed and accuracy to other commonly used modern BLAST implementations such as VSEARCH and BLAST+. We envision similar implementations of nsearch for other languages commonly used in data science, such as Julia, to facilitate sequence similarity comparisons. nsearch, Blaster, and npysearch are free to use under the BSD 3.0 license and available on GitHub, Conda, CRAN (Blaster), and PyPI (npysearch).