Binding-Adaptive Diffusion Models for Structure-Based Drug Design

Zhilin Huang, Ling Yang, Zaixi Zhang, Xiangxin Zhou, Yu Bao, Xiawu Zheng, Yuwei Yang, Yu Wang, Wenming Yang
2024-01-15
Abstract: Structure-based drug design (SBDD) aims to generate 3D ligand molecules that bind to specific protein targets. Existing 3D deep generative models including diffusion models have shown great promise for SBDD. However, it is complex to capture the essential protein-ligand interactions exactly in 3D space for molecular generation. To address this problem, we propose a novel framework, namely Binding-Adaptive Diffusion Models (BindDM). In BindDM, we adaptively extract subcomplex, the essential part of binding sites responsible for protein-ligand interactions. Then the selected protein-ligand subcomplex is processed with SE(3)-equivariant neural networks, and transmitted back to each atom of the complex for augmenting the target-aware 3D molecule diffusion generation with binding interaction information. We iterate this hierarchical complex-subcomplex process with cross-hierarchy interaction node for adequately fusing global binding context between the complex and its corresponding subcomplex. Empirical studies on the CrossDocked2020 dataset show BindDM can generate molecules with more realistic 3D structures and higher binding affinities towards the protein targets, with up to -5.92 Avg. Vina Score, while maintaining proper molecular properties. Our code is available at
Biomolecules, Machine Learning
### What problems does this paper attempt to solve?

This paper addresses a key challenge in structure-based drug design (SBDD). The goal of SBDD is to generate 3D ligand molecules that bind to specific protein targets and regulate their functions. Although existing 3D deep generative models, such as diffusion models, have shown great promise for SBDD, accurately capturing protein-ligand interactions in 3D space remains difficult.

#### Main problems

1. **Complexity of protein-ligand interactions**: It is difficult to capture the essential interactions between proteins and ligands exactly in 3D space.
2. **Limitations of existing methods**:
   - **Autoregressive models (ARMs)**: They are prone to error accumulation, and it is difficult to find the optimal generation order.
   - **Existing diffusion models**: Although they perform well, they pay insufficient attention to the substructures of specific binding sites in protein-ligand complexes, and these substructures are crucial for generating high-affinity molecules.

#### Proposed solutions

To address these problems, the authors propose a new framework, **Binding-Adaptive Diffusion Models (BindDM)**. Its main innovations are:

1. **Adaptive extraction of binding subcomplexes**: A learned structural pooling step extracts, directly from the protein-ligand complex, the subcomplex responsible for binding.
2. **SE(3)-equivariant neural network processing**: SE(3)-equivariant neural networks process the selected subcomplex, and the result is transmitted back to every atom of the complex to augment target-aware 3D molecular diffusion generation with binding interaction information.
3. **Cross-hierarchy interaction nodes**: The complex-subcomplex process is iterated through cross-hierarchy interaction nodes, which fuse the global binding context so that information is exchanged adequately between the complex and its subcomplex.

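The three steps above can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`top_k_subcomplex`, `subcomplex_update`, `bind_step`), the scalar per-atom features, the fixed pooling ratio, the tanh-based message weight, and the mean-feature "context" standing in for a cross-hierarchy interaction node are all assumptions made for illustration. In BindDM the selection scores and message functions would be learned networks; here the scores are simply passed in.

```python
import math


def top_k_subcomplex(scores, ratio=0.5):
    """Pick the indices of the highest-scoring atoms.

    Hypothetical stand-in for learned structural pooling: the scores are
    given rather than predicted by a network.
    """
    k = max(1, math.ceil(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def subcomplex_update(coords, feats, idx, step=0.1):
    """One EGNN-style message-passing step restricted to the selected atoms.

    Coordinates move along relative position vectors, scaled by a weight
    that depends only on invariants (features and squared distance), so the
    update is translation- and rotation-equivariant by construction.
    """
    new_coords = {i: list(coords[i]) for i in idx}
    new_feats = {i: feats[i] for i in idx}
    for i in idx:
        for j in idx:
            if i == j:
                continue
            diff = [a - b for a, b in zip(coords[i], coords[j])]
            dist2 = sum(c * c for c in diff)
            w = math.tanh(feats[i] * feats[j]) / (1.0 + dist2)
            for axis in range(3):
                new_coords[i][axis] += step * w * diff[axis]
            new_feats[i] += w
    return new_coords, new_feats


def bind_step(coords, feats, scores, ratio=0.5):
    """Select a subcomplex, update it, and send its context to every atom."""
    idx = top_k_subcomplex(scores, ratio)
    sub_coords, sub_feats = subcomplex_update(coords, feats, idx)
    # Assumed stand-in for the cross-hierarchy interaction node: the mean
    # subcomplex feature is broadcast to all atoms of the full complex.
    context = sum(sub_feats.values()) / len(sub_feats)
    out_coords = [sub_coords.get(i, list(c)) for i, c in enumerate(coords)]
    out_feats = [sub_feats.get(i, f) + context for i, f in enumerate(feats)]
    return out_coords, out_feats


if __name__ == "__main__":
    # Placeholder toy inputs, not data from the paper.
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 3.0, 3.0)]
    feats = [0.5, -1.0, 2.0, 0.1]
    scores = [0.9, 0.2, 0.8, 0.1]
    new_coords, new_feats = bind_step(coords, feats, scores)
    print(new_coords)
    print(new_feats)
```

Because the coordinate update uses only relative positions and invariant weights, translating the whole complex before calling `bind_step` translates the output coordinates by the same vector and leaves the features unchanged. That is the property an SE(3)-equivariant network is meant to guarantee.
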
#### Experimental results

Experiments on the CrossDocked2020 dataset show that BindDM generates molecules with more realistic 3D structures and higher binding affinities while maintaining proper molecular properties. For example, BindDM reaches an average Vina Score of up to -5.92 (lower scores indicate stronger predicted binding), which is reported to outperform the other baseline methods.

### Summary

BindDM addresses the difficulty of capturing protein-ligand interactions in SBDD by introducing binding-adaptive subcomplex extraction and cross-hierarchy interaction nodes, and it improves the 3D structural realism and binding affinity of the generated molecules.