Prediction of protein-protein interaction using graph neural networks.

Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, Bihar, 801103, India.

Department of Electrical Engineering, Indian Institute of Technology Jodhpur, Jodhpur, Rajasthan, 342030, India.

Publication information

Sci Rep. 2022 May 19;12(1):8360. doi: 10.1038/s41598-022-12201-9.

Abstract

Proteins are the essential biological macromolecules required for nearly all biological processes and cellular functions. Proteins rarely carry out their tasks in isolation; they interact with other proteins in their surroundings (protein-protein interaction) to complete biological activities. Knowledge of protein-protein interactions (PPIs) helps explain cellular behavior and function. Computational methods automate PPI prediction and cost less than experimental methods in resources and time. So far, most work on PPI prediction has focused mainly on sequence information. Here, we use a graph convolutional network (GCN) and a graph attention network (GAT) to predict the interaction between proteins from both their structural information and their sequence features. We build the protein graphs from PDB files, which contain the 3D coordinates of atoms. Each protein graph represents the amino acid network, also known as the residue contact network, in which each node is a residue. Two nodes are connected if they have a pair of atoms (one from each node) within a threshold distance. To extract the node (residue) features, we use a protein language model: its input is the protein sequence, and its output is a feature vector for each amino acid in that sequence. We validate the predictive capability of the proposed graph-based approach on two PPI datasets: Human and S. cerevisiae. The results demonstrate the effectiveness of the proposed approach, which outperforms the previous leading methods. The training code and the training data are available at https://github.com/JhaKanchan15/PPI_GNN.git .
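The edge rule described in the abstract (two residues are connected if any pair of their atoms lies within a threshold distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_contact_graph`, the toy coordinates, and the 6.0 Å threshold are assumptions made for illustration, since the abstract does not state the threshold value. A real pipeline would read the atom coordinates from a PDB file rather than from a hand-written dictionary.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def build_contact_graph(residues, threshold=6.0):
    """Build a residue contact graph.

    residues:  dict mapping residue index -> list of (x, y, z) atom coordinates
               (hypothetical input format; units assumed to be angstroms).
    threshold: assumed cutoff distance; the abstract does not give its value.

    Returns a set of undirected edges (i, j) with i < j, one for each pair of
    residues that has at least one atom pair within the threshold.
    """
    edges = set()
    for i, j in combinations(sorted(residues), 2):
        # Connect i and j if any atom of i is close enough to any atom of j.
        if any(dist(a, b) <= threshold
               for a in residues[i]
               for b in residues[j]):
            edges.add((i, j))
    return edges


# Toy example: three residues placed along the x axis.
toy_residues = {
    0: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    1: [(4.0, 0.0, 0.0), (5.5, 0.0, 0.0)],
    2: [(20.0, 0.0, 0.0)],
}

edges = build_contact_graph(toy_residues, threshold=6.0)
print(edges)  # {(0, 1)}
```

In this toy case, residues 0 and 1 have atoms 2.5 Å apart and are connected, while residue 2 is too far from both and stays isolated. The per-residue feature vectors from the protein language model would then be attached to the nodes of this graph before it is passed to a GCN or GAT.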
