dissectHMMER: a HMMER-based score dissection framework that statistically evaluates fold-critical sequence segments for domain fold similarity.

Author information

Wong Wing-Cheong, Yap Choon-Kong, Eisenhaber Birgit, Eisenhaber Frank

Affiliations

Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore.

Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore, 117597, Singapore.

Publication information

Biol Direct. 2015 Aug 1;10:39. doi: 10.1186/s13062-015-0068-3.

Abstract

BACKGROUND

Annotation transfer for function and structure within the sequence homology concept essentially requires protein sequence similarity for the secondary structural blocks forming the fold of a protein. A simplistic similarity approach in the case of non-globular segments (coiled coils, low complexity regions, transmembrane regions, long loops, etc.) is not justified and is a pertinent source of mistaken homologies. Similarity in such segments arises either from positional sequence conservation as a result of a very simple, physically induced pattern or from integral sequence properties that are critical for function. Furthermore, because the number of well-studied proteins continues to grow only slowly, a search methodology is needed that dives deeper into the sequence similarity space to connect unknown sequences to well-studied, albeit more distant, ones for biological function postulation.

RESULTS

Based on our previous work on dissecting the hidden Markov model (HMMER) based similarity score into fold-critical and non-globular contributions to improve homology inference, we propose a framework, dissectHMMER, that identifies more fold-related domain hits from standard HMMER searches. Subsequent statistical stratification of the fold-related hits into cohorts of functionally related domains allows function postulation for the query sequence. Briefly, this work addresses the technical problems of how to recognize non-globular parts in the domain model, resolve contradictory HMMER2/HMMER3 results, and evaluate fold-related domain hits for homology. The framework is benchmarked against a set of SCOP-to-Pfam domain models. Despite being a sequence-to-profile method, dissectHMMER performs favorably against a profile-to-profile method, HHsuite/HHsearch. Examples of function annotation using dissectHMMER, including the function discovery of an uncharacterized membrane protein Q9K8K1_BACHD (WP_010899149.1) as a lactose/H+ symporter, are presented. Finally, the dissectHMMER webserver is publicly available at http://dissecthmmer.bii.a-star.edu.sg.

CONCLUSIONS

The proposed framework, dissectHMMER, is faithful to the original inception of the sequence homology concept while improving upon the existing HMMER search tool by rescuing, after statistical evaluation, false-negative yet fold-related domain hits to the query sequence. Overall, this translates into an opportunity for any novel protein sequence to be functionally characterized.
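The idea of dissecting a similarity score into fold-critical and non-globular contributions, as summarized under RESULTS, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that per-position score contributions along the domain model are already available as a plain list (extracting them from HMMER output is not shown) and that the non-globular model segments are supplied as half-open index ranges. The names `DissectedScore` and `dissect_score` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DissectedScore:
    """Two partial sums of a per-position similarity score (illustrative only)."""
    fold_critical: float
    non_globular: float

    @property
    def total(self) -> float:
        # The undissected score is the sum of both contributions.
        return self.fold_critical + self.non_globular


def dissect_score(
    position_scores: Sequence[float],
    non_globular_segments: Sequence[Tuple[int, int]],
) -> DissectedScore:
    """Split per-position scores by whether the position lies in a
    non-globular segment. Segments are half-open [start, end) ranges
    over model positions; this input format is an assumption."""
    n = len(position_scores)
    is_non_globular: List[bool] = [False] * n
    for start, end in non_globular_segments:
        if not 0 <= start <= end <= n:
            raise ValueError(f"segment ({start}, {end}) outside model of length {n}")
        for i in range(start, end):
            is_non_globular[i] = True

    fold_critical = sum(
        s for s, ng in zip(position_scores, is_non_globular) if not ng
    )
    non_globular = sum(
        s for s, ng in zip(position_scores, is_non_globular) if ng
    )
    return DissectedScore(fold_critical=fold_critical, non_globular=non_globular)


# Toy data: six model positions, positions 3 and 4 marked as non-globular.
example = dissect_score([1.5, 2.0, -0.5, 3.0, 4.0, 0.5], [(3, 5)])
print(example.fold_critical, example.non_globular, example.total)
```

In this toy input most of the total comes from the masked segment, which is the kind of hit where the total score alone could suggest a similarity that the fold-critical positions do not support. How dissectHMMER statistically evaluates the fold-critical part is described in the paper and is not reproduced here.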
