Latent Diffusion For Conditional Generation of Molecules
Benjamin Kaufman, Edward C. Williams, Ryan Pederson, Carl Underkoffler, Zahid Panjwani, Miles Wang-Henderson, Narbe Mardirossian, Matthew H. Katcher, Zack Strater, Jean-Marc Grandjean, Bryan Lee, John Parkhill
DOI: https://doi.org/10.1101/2024.08.22.609169
2024-08-22
Abstract: Designing a small molecule therapeutic is a challenging multi-parameter optimization problem. Key properties, such as potency, selectivity, bioavailability, and safety, must be jointly optimized to deliver an effective clinical candidate. We present COATI-LDM, a novel application of latent diffusion models to the conditional generation of property-optimized, drug-like small molecules. Diffusive generation of latent molecular encodings, rather than direct diffusive generation of molecular structures, offers an appealing way to handle the small and mismatched datasets that are common for molecular properties. We benchmark various diffusion guidance schemes and sampling methods against a pre-trained autoregressive transformer and genetic algorithms to evaluate control over potency, expert preference, and various physicochemical properties. We show that conditional diffusion allows control over the properties of generated molecules, with practical and performance advantages over competing methods. We also apply the recently introduced idea of particle guidance to enhance sample diversity. We prospectively survey a panel of medicinal chemists and determine that we can conditionally generate molecules that align with their preferences via a learned preference score. Finally, we present a partial diffusion method for the local optimization of molecular properties starting from a seed molecule. Conditional generation of small molecules using latent diffusion models on molecular encodings provides a highly practical and flexible alternative to prior molecular generation schemes.
Bioinformatics