PredLLPS_PSSM: a novel predictor for liquid-liquid protein separation identification based on evolutionary information and a deep neural network.

Affiliations

School of Science, Dalian Maritime University, Dalian 116026, China.

School of Bioengineering, Dalian University of Technology, Dalian 116024, China.

Publication information

Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad299.

Abstract

The formation of biomolecular condensates by liquid-liquid phase separation (LLPS) has emerged as a universal mechanism for the spatiotemporal coordination of biological activities in cells, and it has been widely observed to directly regulate key cellular processes involved in cancer cell pathology. However, the complexity of protein sequences and the diversity of their conformations, which are inherently disordered, pose great challenges for both computational and experimental research on LLPS proteins. Herein, we propose a novel predictor named PredLLPS_PSSM that identifies LLPS proteins based only on sequence evolutionary information. Because real and reliable samples are the cornerstone of building predictors, we newly collected and collated LLPS proteins from the latest versions of three databases. After comparing the performance of the position-specific scoring matrix (PSSM) with that of word embedding, PredLLPS_PSSM was built by combining PSSM-based information with two deep learning frameworks. Tests on three existing independent test datasets and two newly constructed independent test datasets demonstrated the superiority of PredLLPS_PSSM over state-of-the-art methods. Furthermore, we tested PredLLPS_PSSM on nine experimentally identified LLPS proteins from three insects that were not included in any of the databases. In addition, the SHapley Additive exPlanations (SHAP) algorithm and heatmaps were applied to find the amino acids most critical to LLPS.

