Structure-based drug design by denoising voxel grids

Pedro O. Pinheiro, Arian Jamasb, Omar Mahmood, Vishnu Sresht, Saeed Saremi
2024-07-02
Abstract: We present VoxBind, a new score-based generative model for 3D molecules conditioned on protein structures. Our approach represents molecules as 3D atomic density grids and leverages a 3D voxel-denoising network for learning and generation. We extend the neural empirical Bayes formalism (Saremi & Hyvarinen, 2019) to the conditional setting and generate structure-conditioned molecules with a two-step procedure: (i) sample noisy molecules from the Gaussian-smoothed conditional distribution with underdamped Langevin MCMC using the learned score function and (ii) estimate clean molecules from the noisy samples with single-step denoising. Compared to the current state of the art, our model is simpler to train, significantly faster to sample from, and achieves better results on extensive in silico benchmarks: the generated molecules are more diverse, exhibit fewer steric clashes, and bind with higher affinity to protein pockets. The code is available at https://github.com/genentech/voxbind/.
Machine Learning, Biomolecules
What problem does this paper attempt to address?
This paper addresses structure-based drug design: generating 3D molecules (ligands) given a protein structure. It proposes a new generative model, VoxBind, for this task. Traditional virtual screening struggles to find strong binders because chemical space grows exponentially with molecular size. Recent work has proposed generative models as an alternative, but most of them operate on point-cloud representations with SE(3)-equivariant neural networks. VoxBind instead represents molecules as continuous atomic densities on a 3D voxel grid and uses a 3D voxel-denoising network for learning and generation. It extends the neural empirical Bayes framework to the conditional setting and generates structure-conditioned molecules in two steps: (i) sample noisy molecules from the Gaussian-smoothed conditional distribution with underdamped Langevin MCMC, using the learned score function, and (ii) estimate clean ligands from the noisy samples and the pocket structure with single-step denoising. Compared to current state-of-the-art point-cloud diffusion models, the approach is simpler to train and faster to sample from, and it performs better on a wide range of in silico benchmarks, producing molecules with higher binding affinity, greater diversity, fewer steric clashes, and lower strain energy.
Compared to existing point-cloud methods, VoxBind has the following advantages:
1. The voxel representation allows the use of flexible and scalable denoising architectures similar to those used in image generation.
2. Convolutional filters may better capture 3D patterns and shape complementarity, which matters when conditioning on structure.
3. The noise process perturbs voxel values rather than moving atoms in space, which naturally avoids clashes between generated ligands and pockets.
4. Only one fixed noise level is required, which simplifies both training and sampling.
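The two-step sample-then-denoise procedure described above can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the VoxBind code: the "clean" data are drawn from a Gaussian centred on a scalar condition `c` (a stand-in for the pocket), so the conditional denoiser `denoise` is available in closed form instead of being a trained network, and the underdamped Langevin dynamics are discretized with the BAOAB splitting, which is one common choice and not necessarily the integrator the authors use. The score is obtained from the denoiser through the empirical Bayes identity score(y) = (x_hat(y) - y) / sigma^2.

```python
import numpy as np

SIGMA = 1.0  # single fixed noise level (value chosen for illustration)
TAU = 0.5    # spread of the toy clean distribution (assumption)


def denoise(y, c):
    """Toy conditional denoiser: posterior mean of x given noisy y and condition c.

    Stands in for the trained voxel-denoising network; closed form because
    x | c ~ N(c, TAU^2) and y | x ~ N(x, SIGMA^2).
    """
    return (TAU**2 * y + SIGMA**2 * c) / (TAU**2 + SIGMA**2)


def score(y, c):
    """Score of the Gaussian-smoothed conditional density, from the denoiser."""
    return (denoise(y, c) - y) / SIGMA**2


def walk_jump(c, n_chains=2000, n_steps=500, step=0.2, friction=1.0, seed=0):
    """Walk: underdamped Langevin MCMC on noisy y. Jump: one denoising step."""
    rng = np.random.default_rng(seed)
    y = SIGMA * rng.standard_normal(n_chains)  # start the chains from noise
    v = np.zeros(n_chains)                     # auxiliary velocity variable
    decay = np.exp(-friction * step)
    for _ in range(n_steps):
        v += 0.5 * step * score(y, c)          # B: half kick from the score
        y += 0.5 * step * v                    # A: half drift
        v = decay * v + np.sqrt(1.0 - decay**2) * rng.standard_normal(n_chains)  # O
        y += 0.5 * step * v                    # A: half drift
        v += 0.5 * step * score(y, c)          # B: half kick from the score
    return y, denoise(y, c)                    # noisy samples, clean estimates


noisy, clean = walk_jump(c=2.0)
```

In this toy setting the denoised estimates concentrate around the condition `c`, while the noisy samples are spread more widely. In the actual model, `y` would be a noisy voxel grid and `denoise` a 3D convolutional network that also receives the pocket.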
In summary, this paper proposes a new, simple and effective generative model for structure-based drug design, improving existing methods through voxelization and denoising strategies, and demonstrating superior performance on various metrics in experiments.
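To make the voxelization idea concrete, the following is a minimal sketch of how atoms might be turned into a multi-channel density grid and then corrupted with Gaussian noise. The function name `voxelize`, the grid size, the resolution, the Gaussian width, and the use of a per-voxel maximum to combine atoms are all hypothetical choices for illustration and may differ from what the paper and its code actually use.

```python
import numpy as np


def voxelize(coords, types, n_types, grid_dim=33, resolution=0.5, radius=0.5):
    """Place a Gaussian-shaped density at each atom, one channel per atom type.

    coords: (N, 3) atom positions in angstroms, roughly centred on the origin.
    types:  (N,) integer channel index for each atom.
    All default values are illustrative assumptions.
    """
    # Voxel-centre coordinates along one axis, centred on zero.
    axis = (np.arange(grid_dim) - (grid_dim - 1) / 2) * resolution
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((n_types, grid_dim, grid_dim, grid_dim))
    for (x, y, z), t in zip(coords, types):
        d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        # Combine atoms with a max so values stay in [0, 1] (assumption).
        grid[t] = np.maximum(grid[t], np.exp(-d2 / radius**2))
    return grid


# Two atoms of different types, 1.5 angstroms apart along x.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
types = np.array([0, 1])
grid = voxelize(coords, types, n_types=2)

# Noising perturbs voxel values only; the atoms themselves are not moved.
sigma = 0.9  # illustrative noise level, not the paper's setting
rng = np.random.default_rng(0)
noisy_grid = grid + sigma * rng.standard_normal(grid.shape)
```

Because the noise is added to density values on a fixed grid, the peak locations of the clean grid stay where the atoms were placed, which is the property the summary refers to when it says the noise process does not move atoms in space.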