VN-EGNN: E(3)-Equivariant Graph Neural Networks with Virtual Nodes Enhance Protein Binding Site Identification

Florian Sestak, Lisa Schneckenreiter, Johannes Brandstetter, Sepp Hochreiter, Andreas Mayr, Günter Klambauer

2024-04-11

Abstract: Being able to identify regions within or around proteins, to which ligands can potentially bind, is an essential step to develop new drugs. Binding site identification methods can now profit from the availability of large amounts of 3D structures in protein structure databases or from AlphaFold predictions. Current binding site identification methods heavily rely on graph neural networks (GNNs), usually designed to output E(3)-equivariant predictions. Such methods turned out to be very beneficial for physics-related tasks like binding energy or motion trajectory prediction. However, the performance of GNNs at binding site identification is still limited, potentially due to the lack of dedicated nodes that model hidden geometric entities, such as binding pockets. In this work, we extend E(n)-Equivariant Graph Neural Networks (EGNNs) by adding virtual nodes and applying an extended message passing scheme. The virtual nodes in these graphs are dedicated quantities to learn representations of binding sites, which leads to improved predictive performance. In our experiments, we show that our proposed method VN-EGNN sets a new state-of-the-art at locating binding site centers on COACH420, HOLO4K and PDBbind2020.

Machine Learning, Artificial Intelligence, Biomolecules

What problem does this paper attempt to address?

This paper addresses protein binding site identification, a key computational step in drug discovery. Large numbers of protein 3D structures are now available, both from structure databases and from AlphaFold predictions, which creates new opportunities for this task. Current methods rely mainly on graph neural networks (GNNs), especially E(3)-equivariant GNNs, which perform well on physics-related tasks such as binding energy or motion trajectory prediction. Their performance at binding site identification is still limited, however, possibly because they lack dedicated nodes that can model hidden geometric entities such as binding pockets.

The paper proposes VN-EGNN (Virtual Node E(3)-Equivariant Graph Neural Network), which extends EGNNs with virtual nodes and an extended message-passing scheme. The virtual nodes are designed to learn representations of binding sites: as their coordinates are updated, they move toward predicted binding site centers and their features form a neural representation of each site. The paper also discusses general limitations of GNNs, including limited expressive power, over-smoothing, and over-squashing, and points out that virtual nodes can alleviate these issues.

According to the paper, this is the first E(3)-equivariant GNN architecture with virtual nodes that does not rely on prior knowledge. In the reported experiments, VN-EGNN sets a new state of the art at locating binding site centers on COACH420, HOLO4K, and PDBbind2020.
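The idea of virtual nodes whose coordinates are updated by equivariant message passing can be illustrated with a small NumPy sketch. This is not the authors' implementation: the single-matrix "networks", the fully connected message passing, the mean aggregation, the three-phase ordering (residue to residue, residue to virtual, virtual to residue), and all names (`egnn_step`, `vn_egnn_layer`, `init_params`) are assumptions made for illustration. The sketch only shows why such updates keep features invariant and coordinates equivariant under rotations and translations.

```python
import numpy as np

def egnn_step(h_recv, x_recv, h_send, x_send, W_e, W_x, W_h):
    """Simplified EGNN-style update of receiver nodes from all sender nodes.

    Assumption: every receiver hears from every sender (no neighbor cutoff),
    and each learned function is a single tanh layer.
    """
    diff = x_recv[:, None, :] - x_send[None, :, :]       # (R, S, 3) relative positions
    d2 = np.sum(diff ** 2, axis=-1, keepdims=True)       # (R, S, 1) invariant distances
    R, S = d2.shape[:2]
    feats = np.concatenate([
        np.broadcast_to(h_recv[:, None, :], (R, S, h_recv.shape[1])),
        np.broadcast_to(h_send[None, :, :], (R, S, h_send.shape[1])),
        d2,
    ], axis=-1)                                          # (R, S, 2F + 1)
    m = np.tanh(feats @ W_e)                             # (R, S, F) invariant messages
    scale = np.tanh(m @ W_x)                             # (R, S, 1) invariant scalars
    # Coordinates move along relative vectors weighted by invariant scalars,
    # so the update rotates and translates together with the input.
    x_new = x_recv + np.mean(diff * scale, axis=1)
    agg = m.mean(axis=1)                                 # (R, F) aggregated messages
    h_new = h_recv + np.tanh(np.concatenate([h_recv, agg], axis=-1) @ W_h)
    return h_new, x_new

def init_params(F, rng):
    """Random weights for the three hypothetical message-passing phases."""
    def one():
        return (rng.normal(scale=0.1, size=(2 * F + 1, F)),
                rng.normal(scale=0.1, size=(F, 1)),
                rng.normal(scale=0.1, size=(2 * F, F)))
    return {"rr": one(), "rv": one(), "vr": one()}

def vn_egnn_layer(h, x, hv, xv, params):
    """One layer over residue nodes (h, x) and virtual nodes (hv, xv)."""
    h, x = egnn_step(h, x, h, x, *params["rr"])          # residues -> residues
    hv, xv = egnn_step(hv, xv, h, x, *params["rv"])      # residues -> virtual nodes
    h, x = egnn_step(h, x, hv, xv, *params["vr"])        # virtual nodes -> residues
    return h, x, hv, xv

rng = np.random.default_rng(0)
F, n_res, n_virtual = 8, 20, 4
params = init_params(F, rng)
h, x = rng.normal(size=(n_res, F)), 5.0 * rng.normal(size=(n_res, 3))
hv, xv = rng.normal(size=(n_virtual, F)), rng.normal(size=(n_virtual, 3))

h1, x1, hv1, xv1 = vn_egnn_layer(h, x, hv, xv, params)

# Equivariance check: transform the inputs by a random orthogonal matrix Q
# and a translation t, then compare with the transformed outputs.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=3)
h2, x2, hv2, xv2 = vn_egnn_layer(h, x @ Q.T + t, hv, xv @ Q.T + t, params)
print("virtual node coordinates equivariant:", np.allclose(xv2, xv1 @ Q.T + t))
print("virtual node features invariant:", np.allclose(hv2, hv1))
```

In a trained model of this kind, the final virtual node coordinates `xv` would be read out as candidate binding site centers. Here the weights are random, so the positions carry no meaning; only the symmetry property is demonstrated.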
