Abstract: Message-passing neural networks (MPNNs) on molecular graphs generate continuous and differentiable encodings of small molecules with state-of-the-art performance on protein-ligand complex scoring tasks. Here, we describe the Proximity Graph Network (PGN) package, an open-source toolkit that constructs ligand-receptor graphs based on atom proximity and allows users to rapidly apply and evaluate MPNN architectures for a broad range of tasks. We demonstrate the utility of PGN by introducing benchmarks for affinity and docking score prediction tasks. Graph networks generalize better than fingerprint-based models and perform strongly for the docking score prediction task. Overall, MPNNs with Proximity Graph data structures augment the prediction of ligand-receptor complex properties when ligand-receptor data are available.

What problem does this paper attempt to address?

The main problem this paper addresses is improving the accuracy of binding affinity prediction for protein-ligand complexes. Specifically, the authors introduce Proximity Graph Networks (PGNs), an open-source toolkit based on Message Passing Neural Networks (MPNNs) that constructs ligand-receptor graphs based on atomic proximity. The method aims to improve model performance by allowing information to be passed between ligand and protein atoms during learning. The paper also shows how different MPNN architectures apply to different tasks and emphasizes the importance of a modular framework for evaluating them.

### Main Research Questions

1. **Improving the accuracy of binding affinity prediction**:
   - The paper explores how to use PGNs to improve the prediction of the binding affinity of protein-ligand complexes. By introducing a graph structure based on atomic proximity, the model can capture the interactions between ligands and receptors more effectively, thereby improving prediction accuracy.
2. **Evaluating the performance of different MPNN architectures**:
   - The authors test multiple MPNN architectures (such as PFP, DMPNN, GGNET, etc.) and evaluate them on different datasets to determine which architecture performs best on a specific task. The results show that different MPNN architectures have different strengths on different tasks.
3. **Verifying the generalization ability of the model**:
   - The paper pays special attention to how well the model generalizes to unseen receptors. The authors use the PDBbind dataset and the D4 diverse docking dataset to evaluate generalization performance. The results show that PGNs perform well in these tasks, especially on the D4 diverse docking dataset.
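
The idea of building a ligand-receptor graph from atomic proximity can be illustrated with a minimal sketch. It assumes a distance cutoff of 4.5 Å (a value chosen only for illustration; the summary does not state one) and uses hypothetical names such as `build_proximity_graph`. It is not the PGN package's actual API.

```python
import math


def build_proximity_graph(ligand_atoms, receptor_atoms, bonds, cutoff=4.5):
    """Build a toy ligand-receptor proximity graph.

    ligand_atoms, receptor_atoms: lists of (element, (x, y, z)) tuples.
    bonds: set of (i, j) index pairs for covalent bonds between ligand atoms.
    cutoff: assumed distance threshold in angstroms (illustrative only).
    """
    # Ligand atoms come first in the node list, then receptor atoms.
    nodes = [(element, "ligand") for element, _ in ligand_atoms]
    nodes += [(element, "receptor") for element, _ in receptor_atoms]

    # Covalent ligand bonds are kept as ordinary edges.
    edges = [(i, j, "covalent") for i, j in sorted(bonds)]

    # Add a proximity edge for each ligand-receptor atom pair within the cutoff.
    offset = len(ligand_atoms)
    for i, (_, lig_xyz) in enumerate(ligand_atoms):
        for k, (_, rec_xyz) in enumerate(receptor_atoms):
            if math.dist(lig_xyz, rec_xyz) <= cutoff:
                edges.append((i, offset + k, "proximity"))
    return nodes, edges


# Hypothetical coordinates: two ligand atoms, one nearby and one distant receptor atom.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
receptor = [("N", (0.0, 3.0, 0.0)), ("C", (9.0, 9.0, 9.0))]
nodes, edges = build_proximity_graph(ligand, receptor, bonds={(0, 1)})
```

In this toy input, only the nearby receptor atom gains proximity edges, so a message-passing layer applied to `edges` would exchange information between that atom and both ligand atoms, while the distant receptor atom stays isolated.
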
### Methods and Experiments

- **Datasets**:
  - PDBbind 2019 Refined Set: It contains 4,852 high-quality ligand-receptor complexes.
  - PDBbind 2019 General Set: It contains 17,679 ligand-receptor complexes.
  - D4 Diverse Docking Set: It contains 86,452 ligands docked onto the dopamine D4 receptor.
  - D4 Experimental Dataset: It contains 510 ligands with experimental binding data.
- **Model Evaluation**:
  - Use Root Mean Square Error (RMSE) and Pearson Correlation Coefficient (PCC) as evaluation metrics.
  - Select the best model configuration through cross-validation and hyperparameter optimization.

### Results

- **PDBbind Datasets**:
  - The PFP encoder significantly outperforms the baseline model in the protein splitting task, indicating that the graph model has an advantage in generalization performance.
  - The DMPNN architecture performs well on the PDBbind General dataset, especially in the random splitting task.
- **D4 Diverse Datasets**:
  - All graph models significantly outperform the baseline model, and in particular, the DMPNN model performs best.
  - The performance of the model on the similarity-split dataset is comparable to that on the random-split dataset, which may be because the dataset itself is already very diverse.

### Conclusions

- **Main Contributions**:
  - Introduced PGNs, an open-source toolkit based on Message Passing Neural Networks, for constructing ligand-receptor graphs.
  - Demonstrated the performance differences of different MPNN architectures in different tasks and emphasized the importance of the modular framework.
  - Verified the effectiveness and generalization ability of PGNs in predicting ligand-receptor binding affinity.
- **Future Work**:
  - Explore more diverse graph convolution methods, such as deep tensor networks.
  - Research different data augmentation techniques to improve model performance in low-data situations.
  - Apply PGNs to fields such as molecular dynamics simulation and virtual screening.

Through these studies, the authors hope to provide a powerful tool for the fields of drug design and computational chemistry to more accurately predict the properties of ligand-receptor complexes.
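
The two evaluation metrics named in the summary, RMSE and the Pearson correlation coefficient, can be computed as in the sketch below. The function names and the example values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up affinities: every prediction is exactly one unit too high.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
error = rmse(measured, predicted)
correlation = pearson(measured, predicted)
```

The example shows why the two metrics are reported together: a constant offset leaves the correlation perfect while the RMSE still registers the error, so either metric alone can give an incomplete picture of model quality.
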

Proximity Graph Networks: Predicting Ligand Affinity with Message Passing Neural Networks

Encoding Protein-Ligand Interactions: Binding Affinity Prediction with Multigraph-based Modeling and Graph Convolutional Network

SS-GNN: A Simple-Structured Graph Neural Network for Affinity Prediction

Complex machine learning model needs complex testing: Examining predictability of molecular binding affinity by a graph neural network

ProAffinity-GNN: A Novel Approach to Structure-based Protein-Protein Binding Affinity Prediction via a Curated Dataset and Graph Neural Networks

[Pre-decision algorithms for the induction of ovulation in fertilization in vitro].

Protein-Ligand Interaction Graphs: Learning from Ligand-Shaped 3D Interaction Graphs to Improve Binding Affinity Prediction

Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity

Efficient Integration of Molecular Representation and Message-Passing Neural Networks for Predicting Small Molecule Drug-like Properties

A Cascade Graph Convolutional Network for Predicting Protein–Ligand Binding Affinity

PLANET: A Multi-objective Graph Neural Network Model for Protein–Ligand Binding Affinity Prediction

G-PLIP: Knowledge graph neural network for structure-free protein-ligand bioactivity prediction

GIANT: Protein-Ligand Binding Affinity Prediction via Geometry-aware Interactive Graph Neural Network

OnionNet: a Multiple-Layer Intermolecular-Contact-Based Convolutional Neural Network for Protein–Ligand Binding Affinity Prediction

Knowledge-Embedded Message-Passing Neural Networks: Improving Molecular Property Prediction with Human Knowledge

Ligand binding affinity prediction with fusion of graph neural networks and 3D structure-based complex graph

Protein-ligand Binding Affinity Prediction Model Based on Graph Attention Network

ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions

GMPP-NN: a deep learning architecture for graph molecular property prediction