MEG-PPIS: a fast protein-protein interaction site prediction method based on multi-scale graph information and equivariant graph neural network

Hongzhen Ding,Xue Li,Peifu Han,Xu Tian,Fengrui Jing,Shuang Wang,Tao Song,Hanjiao Fu,Na Kang
DOI: https://doi.org/10.1093/bioinformatics/btae269
IF: 5.8
2024-04-18
Bioinformatics
Abstract: Motivation: Protein-protein interaction sites (PPIS) are crucial for deciphering protein action mechanisms and related medical research, which is the key issue in protein action research. Recent studies have shown that graph neural networks have achieved outstanding performance in predicting PPIS. However, these studies often neglect the modeling of information at different scales in the graph and the symmetry of protein molecules within three-dimensional space. Results: In response to this gap, this paper proposes the MEG-PPIS approach, a PPIS prediction method based on multi-scale graph information and E(n) equivariant graph neural network (EGNN). There are two channels in MEG-PPIS: the original graph and the subgraph obtained by graph pooling. The model can iteratively update the features of the original graph and subgraph through the weight-sharing EGNN. Subsequently, the max-pooling operation aggregates the updated features of the original graph and subgraph. Ultimately, the model feeds node features into the prediction layer to obtain prediction results. Comparative assessments against other methods on benchmark datasets reveal that MEG-PPIS achieves optimal performance across all evaluation metrics and gets the fastest runtime. Furthermore, specific case studies demonstrate that our method can predict more true positive and true negative sites than the current best method, proving that our model achieves better performance in the PPIS prediction task. Availability and Implementation: The data and code are available at https://github.com/dhz234/MEG-PPIS.git.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?
This paper addresses the prediction of protein-protein interaction sites (PPIS). Existing methods based on graph neural networks (GNNs) achieve strong results on this task, but they often neglect two things: information at different scales in the protein graph, and the symmetry of protein molecules in three-dimensional space. Both omissions limit the performance of existing methods. To address them, the paper proposes MEG-PPIS, a method built on multi-scale graph information and the E(n) equivariant graph neural network (EGNN). It relies on four design choices:
1. **Multi-scale information modeling**: MEG-PPIS applies graph pooling to the original protein graph to obtain a subgraph, and models the protein structure on both. This helps the model capture patterns at different scales and strengthens its learning ability.
2. **Equivariant graph neural network (EGNN)**: EGNN respects the rotation, reflection, and translation symmetries of protein molecules in three-dimensional space, so structural information is modeled consistently regardless of how the protein is oriented or positioned.
3. **Weight-sharing strategy**: During feature updates, MEG-PPIS shares network weights between the original graph and the subgraph, so the model aggregates neighbor messages over two different ranges when updating node features.
4. **Residual connection**: To alleviate the over-smoothing problem of graph neural networks, MEG-PPIS adds a residual connection to each node-feature update layer, which helps the model keep learning effectively as the network gets deeper.
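
The four points above can be illustrated with a small NumPy sketch. This is not the authors' implementation (see their repository for that), only a toy version under stated assumptions: single linear maps with `tanh` stand in for the MLPs of an EGNN layer, the names `init_layer`, `egnn_layer`, and `two_channel_forward` are hypothetical, the subgraph is chosen by a simple top-k score as a stand-in for the paper's graph pooling, and nodes outside the subgraph simply keep their original-graph features after the max fusion. The prediction layer is omitted.

```python
import numpy as np


def init_layer(dim, rng):
    """Random weights for one layer; single linear maps stand in for the MLPs."""
    return {
        "W_e": rng.normal(scale=0.1, size=(2 * dim + 1, dim)),  # edge messages
        "W_x": rng.normal(scale=0.1, size=(dim, 1)),            # coordinate update
        "W_h": rng.normal(scale=0.1, size=(2 * dim, dim)),      # node update
    }


def egnn_layer(h, x, adj, p):
    """One E(n)-equivariant update with a residual connection on h."""
    n, dim = h.shape
    diff = x[:, None, :] - x[None, :, :]            # (n, n, 3): x_i - x_j
    dist2 = (diff ** 2).sum(-1, keepdims=True)      # (n, n, 1): invariant to rigid motion
    h_i = np.broadcast_to(h[:, None, :], (n, n, dim))
    h_j = np.broadcast_to(h[None, :, :], (n, n, dim))
    msg = np.tanh(np.concatenate([h_i, h_j, dist2], axis=-1) @ p["W_e"])
    msg = msg * adj[:, :, None]                     # zero out non-edges
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    # Coordinates move along relative vectors, scaled by an invariant weight.
    x_new = x + (diff * (msg @ p["W_x"])).sum(axis=1) / deg
    # Residual connection: add the update to the incoming features.
    h_new = h + np.tanh(np.concatenate([h, msg.sum(axis=1)], axis=-1) @ p["W_h"])
    return h_new, x_new


def two_channel_forward(h, x, adj, keep, layers):
    """Update the full graph and a subgraph with shared weights, then max-fuse."""
    h_sub, x_sub, adj_sub = h[keep], x[keep], adj[np.ix_(keep, keep)]
    for p in layers:
        h, x = egnn_layer(h, x, adj, p)                      # original-graph channel
        h_sub, x_sub = egnn_layer(h_sub, x_sub, adj_sub, p)  # subgraph channel, same weights
    fused = h.copy()
    fused[keep] = np.maximum(h[keep], h_sub)  # assumption: element-wise max on shared nodes
    return fused


# Toy data: 12 "residues" with 8-dimensional features and 3D coordinates.
rng = np.random.default_rng(0)
n, dim = 12, 8
h0 = rng.normal(size=(n, dim))
x0 = rng.normal(scale=3.0, size=(n, 3))
dist = np.linalg.norm(x0[:, None] - x0[None, :], axis=-1)
adj = ((dist < 5.0) & ~np.eye(n, dtype=bool)).astype(float)  # arbitrary cutoff
# Hypothetical pooling: keep the top half of nodes under a random feature score.
keep = np.sort(np.argsort(-(h0 @ rng.normal(size=dim)))[: n // 2])
layers = [init_layer(dim, rng) for _ in range(2)]

# Apply a random orthogonal transform (rotation or reflection) plus a translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=3)
out = two_channel_forward(h0, x0, adj, keep, layers)
out_moved = two_channel_forward(h0, x0 @ Q.T + t, adj, keep, layers)
print("features unchanged by rigid motion:", np.allclose(out, out_moved))
```

The final check moves the whole structure rigidly and compares the fused node features. Because each layer only reads squared distances and only moves coordinates along relative vectors, the node features should not change, which is the property that point 2 relies on.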
The experimental results show that MEG-PPIS outperforms existing state-of-the-art methods on multiple evaluation metrics and is also markedly more efficient at prediction. In addition, ablation experiments over different feature combinations show that adding protein structural features significantly improves the performance of the PPIS prediction model.