LDMol: Text-to-Molecule Diffusion Model with Structurally Informative Latent Space

Jinho Chang, Jong Chul Ye

2024-10-03

Abstract: With the emergence of diffusion models as the frontline of generative models, many researchers have proposed molecule generation techniques with conditional diffusion models. However, the unavoidable discreteness of a molecule makes it difficult for a diffusion model to connect raw data with highly complex conditions like natural language. To address this, we present a novel latent diffusion model dubbed LDMol for text-conditioned molecule generation. LDMol comprises a molecule autoencoder that produces a learnable and structurally informative feature space, and a natural language-conditioned latent diffusion model. In particular, recognizing that multiple SMILES notations can represent the same molecule, we employ a contrastive learning strategy to extract a feature space that is aware of the unique characteristics of the molecule structure. LDMol outperforms the existing baselines on the text-to-molecule generation benchmark, suggesting that diffusion models can outperform autoregressive models in text data generation given a better choice of the latent domain. Furthermore, we show that LDMol can be applied to downstream tasks such as molecule-to-text retrieval and text-guided molecule editing, demonstrating its versatility as a diffusion model.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

### The Problem the Paper Attempts to Solve

This paper addresses a key issue in molecular generation: how to use diffusion models to generate valid molecular structures from natural language conditions. Specifically:

1. **The Discreteness of Molecular Data**: Molecular data are inherently discrete (atom types, bond types, and connectivity), while diffusion models have primarily been studied for continuous data domains such as images. This makes it challenging to apply diffusion models directly to molecular generation.
2. **Generation from Text to Molecules**: Most existing diffusion-based methods handle only relatively simple chemical or biological conditions and generate molecules poorly under natural language conditions. Autoregressive models currently perform better on this task.

To overcome these issues, the authors propose LDMol, a latent diffusion model for text-conditioned molecular generation. LDMol combines a molecular autoencoder that produces a structurally informative feature space with a latent diffusion model conditioned on natural language. Because multiple SMILES notations can represent the same molecule, a contrastive learning strategy is used to extract features that capture the unique characteristics of the molecular structure, which helps the model generate valid molecules that match the given text.

Experimental results show that LDMol outperforms existing baselines, including autoregressive models, on the text-to-molecule generation benchmark, and that it can be applied to downstream tasks such as molecule-to-text retrieval and text-guided molecule editing.
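The contrastive idea described above can be illustrated with a small sketch. This summary does not state which loss LDMol uses, so the InfoNCE-style objective below is an assumption, and the function name `info_nce`, the temperature value, and the hand-written 2-D feature vectors are all hypothetical stand-ins for real encoder outputs. The setup it mimics: two different SMILES strings for the same molecule (for example `CCO` and `OCC`, both ethanol) should map to nearby features, while features of different molecules should be pushed apart.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; anchors[i] and positives[i] are two views of molecule i.

    Generic contrastive objective used here as an assumption, not a
    confirmed detail of LDMol's training procedure.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view (index i) as the target.
        total += log_denom - logits[i]
    return total / len(a)


# Hypothetical features for two molecules, each encoded from two SMILES variants.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_aligned = [[0.9, 0.1], [0.1, 0.9]]   # same molecule -> similar feature
view_b_swapped = [[0.1, 0.9], [0.9, 0.1]]   # same molecule -> dissimilar feature

loss_aligned = info_nce(view_a, view_b_aligned)
loss_swapped = info_nce(view_a, view_b_swapped)
print(loss_aligned < loss_swapped)
```

The loss is low when the two views of each molecule agree and high when they do not, which is the pressure that would make an encoder insensitive to the particular SMILES notation chosen.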

Text-Guided Molecule Generation with Diffusion Language Model

3M-Diffusion: Latent Multi-Modal Diffusion for Language-Guided Molecular Structure Generation

A Unified Conditional Diffusion Framework for Dual Protein Targets Based Bioactive Molecule Generation

Multimodal Latent Language Modeling with Next-Token Diffusion

Geometric Latent Diffusion Models for 3D Molecule Generation

LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation

Text-guided Diffusion Model for 3D Molecule Generation

GLDM: hit molecule generation with constrained graph latent diffusion model

Latent Diffusion For Conditional Generation of Molecules

Text-guided Small Molecule Generation Via Diffusion Model

Chemical Language Model Linker: blending text and molecules with modular adapters

Generation of 3D Molecules in Pockets via Language Model

Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation

MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations

MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation

Diffusion-Driven Generative Framework for Molecular Conformation Prediction

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation

Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

ControlMol: Adding Substructure Control To Molecule Diffusion Models