3D molecule generation by denoising voxel grids

Pedro O. Pinheiro, Joshua Rackers, Joseph Kleinhenz, Michael Maser, Omar Mahmood, Andrew Martin Watkins, Stephen Ra, Vishnu Sresht, Saeed Saremi
2024-03-09
Abstract: We propose a new score-based approach to generate 3D molecules represented as atomic densities on regular grids. First, we train a denoising neural network that learns to map from a smooth distribution of noisy molecules to the distribution of real molecules. Then, we follow the neural empirical Bayes framework (Saremi and Hyvarinen, 2019) and generate molecules in two steps: (i) sample noisy density grids from a smooth distribution via underdamped Langevin Markov chain Monte Carlo, and (ii) recover the "clean" molecule by denoising the noisy grid with a single step. Our method, VoxMol, generates molecules in a fundamentally different way than the current state of the art (i.e., diffusion models applied to atom point clouds). It differs in terms of the data representation, the noise model, the network architecture and the generative modeling algorithm. Our experiments show that VoxMol captures the distribution of drug-like molecules better than the state of the art, while being faster to generate samples.
Machine Learning, Quantitative Methods
What problem does this paper attempt to address?
This paper addresses 3D molecule generation: producing new molecules represented as continuous atomic densities on a regular three-dimensional grid. Existing methods typically apply diffusion models to atom point clouds, which has several drawbacks: the number of atoms must be known in advance, continuous and discrete features have to be handled together, and long-range dependencies are hard to capture. The paper proposes VoxMol, which works on voxelized molecules instead. A neural network is trained to denoise noisy voxel grids, that is, to map samples from a smooth distribution of noisy molecules back to real molecules. Generation then follows a "walk-jump" sampling strategy: noisy grids are sampled from the smooth distribution with underdamped Langevin Markov chain Monte Carlo (the walk), and a single denoising step recovers a clean molecule (the jump). The approach differs from current techniques in data representation, noise model, network architecture, and generative modeling algorithm. According to the paper, VoxMol captures the distribution of larger, drug-like molecules better than existing methods and generates samples faster.
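The walk-jump procedure can be sketched on a toy problem. The following is a minimal illustration, not the paper's implementation: it assumes Gaussian noise of scale sigma, replaces the trained network with a closed-form denoiser for Gaussian toy data, and uses a plain Euler-type discretization of underdamped Langevin dynamics. The names toy_denoiser and walk_jump, and all parameter values, are hypothetical. The only link between the denoiser and the sampler is the empirical Bayes identity score(y) = (denoise(y) - y) / sigma^2.

```python
import numpy as np


def toy_denoiser(y, sigma, mu=0.0, tau=1.0):
    """Closed-form least-squares denoiser for toy data x ~ N(mu, tau^2 I)
    observed as y = x + N(0, sigma^2 I). Stands in for a trained network."""
    weight = tau**2 / (tau**2 + sigma**2)
    return mu + weight * (y - mu)


def walk_jump(denoise, shape, sigma, n_steps=500, step=0.1, friction=1.0, seed=0):
    """Walk: sample noisy y with underdamped Langevin MCMC (simple Euler-type
    discretization, chosen for brevity). Jump: denoise y once."""
    rng = np.random.default_rng(seed)
    y = rng.normal(0.0, sigma, size=shape)  # arbitrary initialization
    v = np.zeros(shape)                     # velocity variable
    for _ in range(n_steps):
        # Score of the smoothed density, obtained from the denoiser.
        score = (denoise(y, sigma) - y) / sigma**2
        noise = rng.standard_normal(shape)
        v = v + step * (score - friction * v) + np.sqrt(2.0 * friction * step) * noise
        y = y + step * v
    x_hat = denoise(y, sigma)  # single-step jump back to a "clean" sample
    return x_hat, y


if __name__ == "__main__":
    # 8 independent chains over a 3-dimensional toy "grid".
    x_hat, y = walk_jump(toy_denoiser, shape=(8, 3), sigma=0.9)
    print(x_hat.shape, y.shape)
```

In a voxel setting, y would be a noisy density grid and denoise a trained network; the sampler itself does not need to know the number of atoms, since it only ever sees grids.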