Evaluation of signal peptide prediction algorithms for identification of mycobacterial signal peptides using sequence data from proteomic methods.

Author information

Leversen Nils Anders, de Souza Gustavo A, Målen Hiwa, Prasad Swati, Jonassen Inge, Wiker Harald G

Affiliations

Section of Microbiology and Immunology, The Gade Institute, University of Bergen, N-5021 Bergen, Norway.

Department of Informatics and Computational Biology Unit, BCCS, University of Bergen, N-5020 Bergen, Norway.

Publication information

Microbiology (Reading). 2009 Jul;155(Pt 7):2375-2383. doi: 10.1099/mic.0.025270-0. Epub 2009 Apr 23.

Abstract

Secreted proteins play an important part in the pathogenicity of Mycobacterium tuberculosis, and are the primary source of vaccine and diagnostic candidates. A majority of these proteins are exported via the signal peptidase I-dependent pathway, and have a signal peptide that is cleaved off during the secretion process. Sequence similarities within signal peptides have spurred the development of several algorithms for predicting their presence as well as the respective cleavage sites. For proteins exported via this pathway, algorithms exist for eukaryotes, and for Gram-negative and Gram-positive bacteria. However, the unique structure of the mycobacterial membrane raises the question of whether the existing algorithms are suitable for predicting signal peptides within mycobacterial proteins. In this work, we have evaluated the performance of nine signal peptide prediction algorithms on a positive validation set, consisting of 57 proteins with a verified signal peptide and cleavage site, and a negative set, consisting of 61 proteins that have an N-terminal sequence that confirms the annotated translational start site. We found the hidden Markov model of SignalP v3.0 to be the best-performing algorithm for predicting the presence of a signal peptide in mycobacterial proteins. It predicted no false positives or false negatives, and predicted a correct cleavage site for 45 of the 57 proteins in the positive set. Based on these results, we used the hidden Markov model of SignalP v3.0 to analyse the 10 available annotated proteomes of mycobacterial species, including annotations of M. tuberculosis H37Rv from the Wellcome Trust Sanger Institute and the J. Craig Venter Institute (JCVI). When excluding proteins with transmembrane regions among the proteins predicted to harbour a signal peptide, we found between 7.8 and 10.5% of the proteins in the proteomes to be putative secreted proteins. Interestingly, we observed a consistent difference in the percentage of predicted proteins between the Sanger Institute and JCVI. We have determined the most valuable algorithm for predicting signal peptidase I-processed proteins of M. tuberculosis, and used this algorithm to estimate the number of mycobacterial proteins with the potential to be exported via this pathway.
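
The validation counts reported in the abstract can be turned into standard performance figures. The following is a minimal Python sketch, not the authors' code: the class and function names are hypothetical, and the choice of sensitivity, specificity and cleavage-site accuracy as summary metrics is an assumption, since the abstract does not state which metrics the study reported. Only the counts (57 positives, 61 negatives, no false positives or false negatives, 45 correct cleavage sites) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Confusion-matrix counts for a signal peptide predictor (hypothetical helper)."""
    true_positives: int
    false_negatives: int
    true_negatives: int
    false_positives: int
    correct_cleavage_sites: int  # positives whose predicted cleavage site matched


def sensitivity(c: ValidationCounts) -> float:
    """Fraction of verified signal peptide proteins that were predicted as such."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ValidationCounts) -> float:
    """Fraction of negative-set proteins correctly predicted to lack a signal peptide."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


def cleavage_site_accuracy(c: ValidationCounts) -> float:
    """Fraction of positive-set proteins with a correctly predicted cleavage site."""
    positives = c.true_positives + c.false_negatives
    return c.correct_cleavage_sites / positives


# Counts for the SignalP v3.0 hidden Markov model as stated in the abstract.
signalp_hmm = ValidationCounts(
    true_positives=57,
    false_negatives=0,
    true_negatives=61,
    false_positives=0,
    correct_cleavage_sites=45,
)

if __name__ == "__main__":
    print(f"sensitivity: {sensitivity(signalp_hmm):.3f}")
    print(f"specificity: {specificity(signalp_hmm):.3f}")
    print(f"cleavage-site accuracy: {cleavage_site_accuracy(signalp_hmm):.3f}")
```

Under these counts, presence prediction is perfect on both sets, while the cleavage site is correct for 45 of 57 positives, roughly 79%.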

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/348c/2885676/5621124ea581/2375fig1.jpg
