AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design

Xinze Li, Penglei Wang, Tianfan Fu, Wenhao Gao, Chengtao Li, Leilei Shi, Junhong Liu
2024-04-03
Abstract: Structure-based drug design (SBDD), which aims to generate molecules that can bind tightly to the target protein, is an essential problem in drug discovery, and previous approaches have achieved initial success. However, most existing methods still suffer from invalid local structure or unrealistic conformation issues, which are mainly due to the poor learning of bond angles or torsional angles. To alleviate these problems, we propose AUTODIFF, a diffusion-based fragment-wise autoregressive generation model. Specifically, we first design a novel molecule assembly strategy named conformal motif that preserves the conformation of local structures of molecules; we then encode the interaction of the protein-ligand complex with an SE(3)-equivariant convolutional network and generate molecules motif-by-motif with diffusion modeling. In addition, we improve the evaluation framework of SBDD by constraining the molecular weights of the generated molecules to the same range and by adding some new metrics, which makes the evaluation fairer and more practical. Extensive experiments on CrossDocked2020 demonstrate that our approach outperforms the existing models in generating realistic molecules with valid structures and conformations while maintaining high binding affinity.
Machine Learning
What problem does this paper attempt to address?
The paper addresses how to generate molecules that bind tightly to a target protein while keeping their structures chemically valid and their conformations realistic. Existing structure-based drug design methods often produce invalid local structures or unrealistic conformations, mainly because they learn bond angles and torsion angles poorly. To address this, the paper proposes AUTODIFF, a diffusion-based model that generates molecules autoregressively, one fragment at a time. It introduces a molecule assembly strategy called the conformal motif, which preserves the conformation of local structures, and it encodes the interaction of the protein-ligand complex with an SE(3)-equivariant convolutional network. The molecule is then generated motif-by-motif with diffusion modeling. The paper also revises the evaluation framework, constraining the molecular weights of generated molecules to the same range and adding new metrics, so that comparisons are fairer and more practical. Experiments on CrossDocked2020 show that AUTODIFF generates molecules with valid structures and realistic conformations while maintaining high binding affinity.
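The molecular-weight constraint mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the record fields (`mol_weight`, `score`), the helper names, the weight range, and the sample values are all assumptions made up for illustration. The sketch only shows the general idea of comparing models on molecules restricted to one shared weight range, so that a model is not favored simply for generating larger molecules.

```python
from statistics import mean


def filter_by_weight(molecules, low, high):
    """Keep only records whose molecular weight lies in [low, high]."""
    return [m for m in molecules if low <= m["mol_weight"] <= high]


def mean_score(molecules, key="score"):
    """Average a per-molecule score; returns None for an empty list."""
    if not molecules:
        return None
    return mean(m[key] for m in molecules)


# Hypothetical records (made-up numbers, not results from the paper).
# "score" stands in for any per-molecule metric, e.g. a docking score.
generated = [
    {"mol_weight": 250.0, "score": -6.0},
    {"mol_weight": 520.0, "score": -9.0},  # outside the assumed range
    {"mol_weight": 310.0, "score": -7.0},
]

# Assumed weight range, chosen only for this example.
in_range = filter_by_weight(generated, low=200.0, high=400.0)
constrained_mean = mean_score(in_range)
```

Applying the same `low` and `high` bounds to every model's output before averaging is what keeps the comparison like-for-like in this sketch.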