Hidden Markov model speed heuristic and iterative HMM search procedure.

Affiliation

Department of Immunology and Pathology, Washington University School of Medicine, St Louis, Missouri, USA.

Publication information

BMC Bioinformatics. 2010 Aug 18;11:431. doi: 10.1186/1471-2105-11-431.

Abstract

BACKGROUND

Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require considerable time to search large sequence databases.
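To illustrate why these scoring algorithms are costly, here is a minimal sketch of Viterbi and Forward scoring for a small, generic HMM. It is not HMMER's profile-HMM implementation: the function names, the dense transition matrix, and the toy model layout are assumptions made for illustration. Both algorithms fill the same dynamic-programming table and differ only in how they combine incoming paths (maximum versus log-sum), so the work grows with sequence length times the number of transitions, repeated for every database sequence.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_score(obs, log_start, log_trans, log_emit, combine):
    """Score a non-empty observation sequence against a toy HMM.

    combine=max        -> Viterbi score (log probability of the best path)
    combine=logsumexp  -> Forward score (log probability over all paths)

    log_start[s]     : log P(first state is s)
    log_trans[r][s]  : log P(next state is s | current state is r)
    log_emit[s][x]   : log P(symbol x | state s)
    """
    n = len(log_start)
    # Initialise with the first symbol.
    prev = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    # One table column per remaining symbol; each cell looks at n predecessors.
    for symbol in obs[1:]:
        prev = [
            combine([prev[r] + log_trans[r][s] for r in range(n)])
            + log_emit[s][symbol]
            for s in range(n)
        ]
    return combine(prev)
```

Calling `hmm_log_score(seq, start, trans, emit, max)` gives the Viterbi score and passing `logsumexp` instead gives the Forward score. The Forward score is never lower than the Viterbi score, because it sums over every path rather than keeping only the best one.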

RESULTS

We have designed a series of database filtering steps, HMMERHEAD, that are applied prior to the scoring algorithms, as implemented in the HMMER package, in an effort to reduce search time. Using this heuristic, we obtain a 20-fold decrease in Forward and a 6-fold decrease in Viterbi search time with a minimal loss in sensitivity relative to the unfiltered approaches. We then implemented an iterative profile-HMM search method, JackHMMER, which employs the HMMERHEAD heuristic. Due to our search heuristic, we eliminated the subdatabase creation that is common in current iterative profile-HMM approaches. On our benchmark, JackHMMER detects 14% more remote protein homologs than SAM's iterative method T2K.
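The general pattern described above, a cheap filter applied before an expensive score and an iterative search that rescans the full database each round, can be sketched as follows. The abstract does not specify the HMMERHEAD filter stages, so the shared k-mer filter here is a stand-in assumption, and the scoring function is a toy placeholder for Viterbi or Forward. All function names and parameters are hypothetical.

```python
def kmers(seq, k=3):
    """Set of all length-k words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def passes_filter(family_words, target, k=3, min_shared=1):
    """Cheap pre-filter (assumed, not HMMERHEAD's actual criteria):
    keep a target only if it shares enough words with the family."""
    return len(family_words & kmers(target, k)) >= min_shared


def expensive_score(family_words, target, k=3):
    """Toy stand-in for Viterbi/Forward scoring: the fraction of the
    target's words that also occur in the family."""
    words = kmers(target, k)
    return len(family_words & words) / len(words) if words else 0.0


def filtered_search(members, database, threshold, k=3):
    """Score only the database sequences that survive the filter."""
    family_words = set().union(*(kmers(m, k) for m in members))
    hits = []
    for name, seq in database.items():
        if not passes_filter(family_words, seq, k):
            continue  # rejected cheaply; the expensive scorer never runs
        if expensive_score(family_words, seq, k) >= threshold:
            hits.append(name)
    return hits


def iterative_search(query, database, threshold=0.25, max_rounds=5, k=3):
    """Rebuild the family from each round's hits and search the whole
    database again, with no intermediate subdatabase."""
    members = {query}
    found = []
    for _ in range(max_rounds):
        hits = filtered_search(members, database, threshold, k)
        new = [h for h in hits if h not in found]
        if not new:
            break  # converged: no new homologs this round
        found.extend(new)
        members.update(database[h] for h in new)
    return found
```

In this sketch a sequence that shares nothing with the original query can still be found in a later round, once an intermediate hit has been added to the family. That is the behaviour an iterative search aims for, and it is affordable to rescan the full database each round only because the filter keeps most sequences away from the expensive scorer.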

CONCLUSIONS

Our search heuristic, HMMERHEAD, significantly reduces the time needed to score a profile-HMM against large sequence databases. This search heuristic allowed us to implement an iterative profile-HMM search method, JackHMMER, which detects significantly more remote protein homologs than SAM's T2K and NCBI's PSI-BLAST.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6480/2931519/1c2e80a3d0b3/1471-2105-11-431-1.jpg
