From Static to Dynamic Structures: Improving Binding Affinity Prediction with Graph-Based Deep Learning

Yaosen Min, Ye Wei, Peizhuo Wang, Xiaoting Wang, Han Li, Nian Wu, Stefan Bauer, Shuxin Zheng, Yu Shi, Yingheng Wang, Ji Wu, Dan Zhao, Jianyang Zeng
DOI: https://doi.org/10.1002/advs.202405404
2024-09-02
Abstract: Accurate prediction of protein-ligand binding affinities is an essential challenge in structure-based drug design. Despite recent advances in data-driven methods for affinity prediction, their accuracy is still limited, partially because they only take advantage of static crystal structures while the actual binding affinities are generally determined by the thermodynamic ensembles between proteins and ligands. One effective way to approximate such a thermodynamic ensemble is to use molecular dynamics (MD) simulation. Here, an MD dataset containing 3,218 different protein-ligand complexes is curated, and Dynaformer, a graph-based deep learning model, is further developed to predict the binding affinities by learning the geometric characteristics of the protein-ligand interactions from the MD trajectories. In silico experiments demonstrate that the model exhibits state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset, outperforming the methods hitherto reported. Moreover, in a virtual screening on heat shock protein 90 (HSP90) using Dynaformer, 20 candidates are identified and their binding affinities are further experimentally validated. Dynaformer displays promising results in virtual drug screening, revealing 12 hit compounds (two in the submicromolar range), including several novel scaffolds. Overall, these results demonstrate that the approach offers a promising avenue for accelerating the early drug discovery process.
Biomolecules, Machine Learning, Chemical Physics, Quantitative Methods
What problem does this paper attempt to address?
The paper addresses protein-ligand binding affinity prediction, a key challenge in structure-based drug design. Despite recent advances in data-driven approaches, their accuracy remains limited, partly because they use only static crystal structures, whereas actual binding affinity is generally determined by the thermodynamic ensemble of protein-ligand interactions. To tackle this, the research team curated a molecular dynamics (MD) simulation dataset covering 3,218 protein-ligand complexes and proposed a graph-based deep learning model named Dynaformer, which predicts binding affinity by learning the geometric features of protein-ligand interactions from MD trajectories. In silico experiments show that the model achieves state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset. Additionally, in a virtual screening against heat shock protein 90 (HSP90), Dynaformer identified 20 candidate compounds whose binding affinities were then measured experimentally; 12 proved to be hits, two of them in the submicromolar range, and several have novel scaffolds. These results indicate Dynaformer's potential to accelerate the early drug discovery process.
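To make "scoring power" and "ranking power" concrete, the following is a minimal sketch under the usual CASF convention: scoring power as the Pearson correlation between predicted and measured affinities, and ranking power as the Spearman rank correlation computed within each target and then averaged over targets. It is not the authors' evaluation code; the function names and the toy affinity values are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def scoring_power(predicted, measured):
    """Assumed definition: Pearson r over all complexes."""
    return pearson(predicted, measured)


def ranking_power(predicted, measured, targets):
    """Assumed definition: mean per-target Spearman rho."""
    groups = {}
    for p, m, t in zip(predicted, measured, targets):
        preds, meas = groups.setdefault(t, ([], []))
        preds.append(p)
        meas.append(m)
    rhos = [spearman(preds, meas) for preds, meas in groups.values()]
    return sum(rhos) / len(rhos)


# Hypothetical toy data: three ligands for each of two targets "A" and "B".
targets = ["A", "A", "A", "B", "B", "B"]
predicted = [5.0, 6.0, 7.0, 4.0, 3.0, 8.0]
measured = [5.5, 6.5, 7.5, 4.2, 5.0, 9.0]

print(scoring_power(predicted, measured))
print(ranking_power(predicted, measured, targets))
```

In this toy case the ligands of target "A" are ranked perfectly while two ligands of target "B" are swapped, so the averaged ranking score falls below 1 even though the overall correlation stays high. This is why the two measures are reported separately.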