MAGNet: Motif-Agnostic Generation of Molecules from Shapes

Leon Hetzel, Johanna Sommer, Bastian Rieck, Fabian Theis, Stephan Günnemann

2023-11-07

Abstract: Recent advances in machine learning for molecules exhibit great potential for facilitating drug discovery from in silico predictions. Most models for molecule generation rely on the decomposition of molecules into frequently occurring substructures (motifs), from which they generate novel compounds. While motif representations greatly aid in learning molecular distributions, such methods struggle to represent substructures beyond their known motif set. To alleviate this issue and increase flexibility across datasets, we propose MAGNet, a graph-based model that generates abstract shapes before allocating atom and bond types. To this end, we introduce a novel factorisation of the molecules' data distribution that accounts for the molecules' global context and facilitates learning adequate assignments of atoms and bonds onto shapes. Despite the added complexity of shape abstractions, MAGNet outperforms most other graph-based approaches on standard benchmarks. Importantly, we demonstrate that MAGNet's improved expressivity leads to molecules with more topologically distinct structures and, at the same time, diverse atom and bond assignments.

Chemical Physics, Machine Learning

What problem does this paper attempt to address?

The paper addresses a limitation of current molecule generation models: they struggle with substructures that fall outside their known set of motifs. Specifically:

- **Problems with existing methods**: Most existing molecule generation models decompose molecules into frequently occurring substructures (called motifs) and then generate new compounds from these motifs. While motif representations help in learning molecular distributions, these methods struggle to represent substructures beyond their known motif set.
- **Proposed new method**: To overcome this limitation and improve flexibility across datasets, the paper proposes MAGNet, a graph-based model that first generates abstract shapes and then assigns atom and bond types. By introducing a new factorization of the molecules' data distribution, MAGNet accounts for the global context of the molecule and learns appropriate assignments of atoms and bonds onto shapes.

The paper demonstrates that MAGNet not only outperforms most other graph-based methods on standard benchmarks but also generates molecules with more topologically distinct structures and more varied atom and bond assignments. Additionally, MAGNet shows superior performance in reconstructing uncommon structures, such as macrocycles, indicating its advantage in capturing complex structures.
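The distinction between a typed motif and an abstract shape can be illustrated with a small sketch. This is not MAGNet's implementation: the fragment encoding, the `to_shape` helper, and the brute-force canonical form are all assumptions made for illustration, reading "shape" as connectivity with atom and bond types removed. The toy fragments are written by hand rather than parsed with a chemistry toolkit.

```python
from itertools import permutations

# Hypothetical toy fragments: (atom symbols, bonds as (i, j, bond order)).
BENZENE = (["C"] * 6, [(i, (i + 1) % 6, 1.5) for i in range(6)])
PYRIDINE = (["N"] + ["C"] * 5, [(i, (i + 1) % 6, 1.5) for i in range(6)])
CYCLOPENTANE = (["C"] * 5, [(i, (i + 1) % 5, 1.0) for i in range(5)])


def to_shape(fragment):
    """Drop atom and bond types and return a canonical untyped graph.

    The canonical form is the smallest relabelled edge list over all node
    permutations. This is exact but only practical for very small fragments.
    """
    atoms, bonds = fragment
    n = len(atoms)
    edges = [(i, j) for i, j, _ in bonds]  # discard the bond order
    best = None
    for perm in permutations(range(n)):
        relabelled = tuple(
            sorted(tuple(sorted((perm[i], perm[j]))) for i, j in edges)
        )
        if best is None or relabelled < best:
            best = relabelled
    return n, best


fragments = [BENZENE, PYRIDINE, CYCLOPENTANE]

# A typed vocabulary keeps one entry per distinct atom/bond assignment.
typed_vocab = {(tuple(atoms), tuple(bonds)) for atoms, bonds in fragments}
# A shape vocabulary keeps one entry per distinct topology.
shape_vocab = {to_shape(f) for f in fragments}

print(len(typed_vocab), len(shape_vocab))  # 3 2
```

Under this reading, benzene and pyridine are separate vocabulary entries for a motif-based model but share a single six-membered-ring shape, so a shape-first model would only need to learn a different atom assignment on the same shape.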
