How to make machine learning scoring functions competitive with FEP

Philip Biggin, Matthew Warren, Ísak Valsson, Charlotte Deane, Aniket Magarkar, Garrett Morris

DOI: https://doi.org/10.26434/chemrxiv-2024-bth5z

2024-06-24

Abstract: Machine learning offers a promising approach for fast and accurate binding affinity predictions. However, current models often fail to generalise beyond their training data and are not robustly evaluated on a diverse range of benchmarks, limiting their application in drug discovery projects. In this work, we address these issues by introducing a novel graph neural network model called AEV-PLIG (Atomic Environment Vector - Protein Ligand Interaction Graph), which encodes protein-ligand interactions via atomic environment vectors to improve generalisation. We evaluate our model on improved benchmarks, including our new out-of-distribution test set we call OOD Test, and two alternative benchmark systems used for free energy perturbation (FEP) calculations, and highlight competitive performance of AEV-PLIG across the board. Moreover, we demonstrate how augmented data can be leveraged to enhance prediction accuracy, and how enriching the training data with three complexes from a congeneric series of ligands binding to a target of interest improves performance further. Altogether, we show that these strategies improve the applicability of machine learning scoring functions and enable state-of-the-art performance nearing the accuracy of physics-based simulation methods, but at a fraction of their computational cost. This practical approach extends the predictive capabilities of machine learning for molecular discovery, paving the way for its broader use in computer-aided drug design.

Chemistry

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper addresses two major issues in machine learning prediction of protein-ligand binding affinity:

1. **Lack of Generalization**: Current machine learning models often fail to generalize beyond their training data, which limits their application in drug discovery projects.
2. **Insufficient Evaluation**: Existing models are not robustly evaluated on a diverse range of benchmarks, which leaves their performance on out-of-distribution data unclear.

To tackle these problems, the authors introduce a new graph neural network model, AEV-PLIG (Atomic Environment Vector - Protein Ligand Interaction Graph), which encodes protein-ligand interactions via atomic environment vectors, and evaluate it on improved benchmarks. Specifically, they construct a new out-of-distribution benchmark, "OOD Test", designed to penalize models that memorize ligands or proteins, so that good scores reflect generalization to unseen data. They also evaluate the model on two benchmark systems used for free energy perturbation (FEP) calculations.

Additionally, the authors show that augmented data improves prediction accuracy, and that enriching the training data with three complexes from a congeneric series of ligands binding to the target of interest improves performance further. With these strategies, AEV-PLIG approaches the accuracy of physics-based simulation methods at a fraction of their computational cost.
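To make the idea of an atomic environment vector more concrete, the sketch below computes a radial descriptor for one ligand atom from the protein atoms around it. It is an illustration only: it assumes the radial symmetry function with a cosine cutoff familiar from ANI-style descriptors, and the function names (`cosine_cutoff`, `radial_aev`), the parameter values (`eta`, `r_cut`, `shifts`), and the toy coordinates are all hypothetical. The featurisation actually used in AEV-PLIG is not specified in the text above and may differ.

```python
import math


def cosine_cutoff(r, r_cut):
    """Smooth weight that falls from 1 at r = 0 to 0 at r = r_cut (0 beyond)."""
    if r > r_cut:
        return 0.0
    return 0.5 * math.cos(math.pi * r / r_cut) + 0.5


def radial_aev(center, neighbours, element_types, shifts, eta=4.0, r_cut=5.0):
    """Radial environment vector for one ligand atom (illustrative sketch).

    center        -- (x, y, z) of the ligand atom, in angstroms
    neighbours    -- list of (element, (x, y, z)) for protein atoms
    element_types -- protein elements to resolve, e.g. ["C", "N"]
    shifts        -- Gaussian centres (angstroms) probing different distances
    Returns a flat list of length len(element_types) * len(shifts).
    All parameter values here are assumptions, not taken from the paper.
    """
    aev = []
    for element in element_types:
        for shift in shifts:
            total = 0.0
            for neigh_element, xyz in neighbours:
                if neigh_element != element:
                    continue
                r = math.dist(center, xyz)
                # Gaussian centred at `shift`, damped by the cutoff weight.
                total += math.exp(-eta * (r - shift) ** 2) * cosine_cutoff(r, r_cut)
            aev.append(total)
    return aev


# Toy environment: one carbon within the cutoff, one nitrogen outside it.
ligand_atom = (0.0, 0.0, 0.0)
protein_atoms = [("C", (1.5, 0.0, 0.0)), ("N", (10.0, 0.0, 0.0))]
aev = radial_aev(ligand_atom, protein_atoms, ["C", "N"], shifts=[1.5, 3.0])
```

In this toy case the carbon entries are non-zero and the nitrogen entries are zero, because the nitrogen lies beyond the cutoff. Vectors of this kind could serve as node features in a protein-ligand interaction graph, which is the role the abstract describes for them.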

