DeepLoc 2.1: multi-label membrane protein type prediction using protein language models.

Affiliations

Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Kongens Lyngby, Denmark.

Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark.

Publication information

Nucleic Acids Res. 2024 Jul 5;52(W1):W215-W220. doi: 10.1093/nar/gkae237.

DOI: 10.1093/nar/gkae237

PMID: 38587188

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11223819/

Abstract

DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies the input proteins into the membrane protein types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based protein language models, the server utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic protein sequences, constructed following stringent homology partitioning, demonstrate state-of-the-art performance. Notably, DeepLoc 2.1 outperforms existing models, with the larger ProtT5 model exhibiting a marginal advantage over the ESM-1B model. The web server is available at https://services.healthtech.dtu.dk/services/DeepLoc-2.1.
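The multi-label aspect of the prediction can be illustrated with a minimal sketch. It assumes that a model emits one raw score (logit) per membrane protein type and that each type is accepted independently when its sigmoid probability reaches a threshold, so a protein may receive several labels or none. The function name `decode_multilabel`, the example logits, and the 0.5 thresholds are hypothetical and chosen for illustration only; this is not DeepLoc 2.1's implementation.

```python
import math

# The four membrane protein types named in the abstract.
MEMBRANE_TYPES = ["Transmembrane", "Peripheral", "Lipid-anchored", "Soluble"]


def sigmoid(x: float) -> float:
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, thresholds=None):
    """Return every type whose probability reaches its threshold.

    Hypothetical helper: each label is decided independently, which is
    what distinguishes multi-label from single-label (argmax) decoding.
    """
    if thresholds is None:
        # Assumed default; a real model would tune one threshold per label.
        thresholds = [0.5] * len(MEMBRANE_TYPES)
    if len(logits) != len(MEMBRANE_TYPES) or len(thresholds) != len(MEMBRANE_TYPES):
        raise ValueError("expected one logit and one threshold per membrane type")
    probs = [sigmoid(z) for z in logits]
    return [name for name, p, t in zip(MEMBRANE_TYPES, probs, thresholds) if p >= t]


# Made-up logits for a single protein, for illustration only.
example_logits = [2.0, -1.5, 0.3, -3.0]
predicted = decode_multilabel(example_logits)
print(predicted)  # ['Transmembrane', 'Lipid-anchored']
```

With an argmax decoder the same scores would yield only "Transmembrane"; independent thresholds are what allow the second label to be reported as well.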

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f549/11223819/6818fd6766a7/gkae237figgra1.jpg
