DeepLoc 2.1: multi-label membrane protein type prediction using protein language models.

Affiliations

Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Kongens Lyngby, Denmark.

Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark.

Publication information

Nucleic Acids Res. 2024 Jul 5;52(W1):W215-W220. doi: 10.1093/nar/gkae237.

Abstract

DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies the input proteins into the membrane protein types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based protein language models, the server utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic protein sequences, constructed following stringent homology partitioning, demonstrate state-of-the-art performance. Notably, DeepLoc 2.1 outperforms existing models, with the larger ProtT5 model exhibiting a marginal advantage over the ESM-1B model. The web server is available at https://services.healthtech.dtu.dk/services/DeepLoc-2.1.
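To make "multi-label" concrete: a protein may receive more than one of the four membrane protein types at once, rather than exactly one. The following is a minimal sketch of one common way such outputs are decoded, assuming one independent sigmoid score per label and a fixed threshold. The function names, the 0.5 threshold, and the example logits are illustrative assumptions, not DeepLoc 2.1's actual implementation or API; only the four label names come from the abstract.

```python
import math

# Label names as listed in the abstract; everything else below is illustrative.
MEMBRANE_TYPES = ["Transmembrane", "Peripheral", "Lipid-anchored", "Soluble"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Unlike softmax/argmax decoding, labels do not compete with each other,
    so zero, one, or several labels can be returned for the same protein.
    The threshold value is an assumption made for this sketch.
    """
    if len(logits) != len(MEMBRANE_TYPES):
        raise ValueError("expected one logit per membrane protein type")
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(MEMBRANE_TYPES, probs) if p >= threshold]


# Hypothetical logits for a single protein (not real model output).
predicted = decode_multilabel([2.0, -1.5, 0.3, -3.0])
print(predicted)
```

With these made-up logits, two labels clear the threshold, which a single-label (argmax) scheme could not express.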

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f549/11223819/6818fd6766a7/gkae237figgra1.jpg
