Assessment of molecular dynamics time series descriptors in protein-ligand affinity prediction.

Pawel Siedlecki, Jakub Poziemski, Artur Yurkevych

DOI: https://doi.org/10.26434/chemrxiv-2024-dxv36

2024-05-22

Abstract: The advancement of computational methods in drug discovery, particularly through the use of machine learning (ML) and deep learning (DL), has significantly enhanced the precision of binding affinity predictions. Despite progress in computer-aided drug discovery (CADD), accurate prediction of binding affinity remains a challenge due to the complex, non-linear character of molecular interactions. Generalizability continues to limit these models, with performance discrepancies noted between training datasets and external test conditions. This study explores the integration of molecular dynamics (MD) simulations with ML to assess its predictive performance and limitations. In particular, MD simulations offer a dynamic perspective by depicting the temporal interactions within protein-ligand complexes, potentially bringing additional information for affinity and specificity estimates. By generating and analyzing over 800 unique protein-ligand MD simulations, we evaluate the utility of MD-derived descriptors based on time series in enhancing predictive accuracy. The findings suggest specific and generalizable features derived from MD data and propose approaches to augment current in silico affinity prediction methods.

Chemistry

What problem does this paper attempt to address?

This paper examines whether time series descriptors derived from molecular dynamics (MD) simulations can improve protein-ligand binding affinity prediction. Computer-aided drug discovery (CADD) methods, especially machine learning (ML) and deep learning (DL), have improved the accuracy of binding affinity prediction, but accurate prediction remains difficult because molecular interactions are complex and non-linear. Current models also perform differently on their training datasets than under external test conditions, which limits their ability to generalize.

To address this, the researchers ran over 800 unique protein-ligand MD simulations and analyzed time series features extracted from the trajectories to evaluate whether they improve prediction accuracy. MD simulations show how a protein-ligand complex behaves over time, which may contribute additional information for estimating affinity and specificity. The study found that MD-derived features can improve predictive performance and proposed approaches for augmenting current in silico affinity prediction methods. It also indicated that the usefulness of MD data may be target-specific and may depend on the signal-to-noise ratio, which is affected by factors such as the number of simulation frames and the length of the simulation.

In summary, the paper asks how the non-linear information in MD data can be used to improve binding affinity prediction and whether MD data yields new descriptors that contribute to prediction. Based on large-scale experiments and analysis, the researchers propose improving ML model performance by carefully selecting and filtering MD-derived features.
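To make the idea of "time series descriptors" concrete, the sketch below collapses one per-frame signal from a trajectory (for example, a ligand RMSD or a contact distance) into a few scalar features and then drops features that barely vary across complexes. This is only an illustration under stated assumptions: the function names `summarize_series` and `drop_low_variance`, the particular descriptors (mean, spread, drift, lag-1 autocorrelation), and the variance threshold are hypothetical choices, not the feature set or filtering procedure used in the paper.

```python
import statistics
from typing import Dict, List, Sequence


def summarize_series(values: Sequence[float]) -> Dict[str, float]:
    """Collapse one per-frame time series into scalar descriptors.

    Hypothetical descriptor set chosen for illustration only.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three frames")

    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation

    # Least-squares slope against the frame index: drift over the trajectory.
    t_mean = (n - 1) / 2
    cov = sum((t - t_mean) * (v - mean) for t, v in enumerate(values))
    var_t = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var_t

    # Lag-1 autocorrelation; set to 0.0 for a constant series.
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        lag1 = 0.0
    else:
        num = sum(
            (values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1)
        )
        lag1 = num / denom

    return {"mean": mean, "std": std, "slope": slope, "lag1_autocorr": lag1}


def drop_low_variance(
    rows: List[Dict[str, float]], min_std: float = 1e-6
) -> List[str]:
    """Return descriptor names whose spread across complexes exceeds min_std.

    A crude stand-in for feature filtering; the threshold is an assumption.
    """
    names = sorted(rows[0])
    return [
        name
        for name in names
        if statistics.pstdev([row[name] for row in rows]) > min_std
    ]


if __name__ == "__main__":
    # Toy per-frame values standing in for a real MD-derived signal.
    complex_a = [2.0, 2.1, 2.3, 2.2, 2.6, 2.8]
    complex_b = [1.5, 1.4, 1.6, 1.5, 1.5, 1.4]
    table = [summarize_series(complex_a), summarize_series(complex_b)]
    print(drop_low_variance(table))
```

In practice each complex would contribute many such series, and the resulting feature table would feed an ML regressor. Trajectory length and frame count change how noisy descriptors like the slope are, which is one way the signal-to-noise concern mentioned above could show up.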
