Equivariant score-based generative diffusion framework for 3D molecules

Hao Zhang, Yang Liu, Xiaoyan Liu, Cheng Wang, Maozu Guo
DOI: https://doi.org/10.1186/s12859-024-05810-w
IF: 3.307
2024-06-01
BMC Bioinformatics
Abstract: Molecular biology is crucial for drug discovery, protein design, and human health. Due to the vastness of the drug-like chemical space, depending on biomedical experts to manually design molecules is exceedingly expensive. Utilizing generative methods with deep learning technology offers an effective approach to streamline the search space for molecular design and save costs. This paper introduces a novel E(3)-equivariant score-based diffusion framework for 3D molecular generation via SDEs, aiming to address the constraints of unified Gaussian diffusion methods. Within the proposed framework EMDS, the complete diffusion is decomposed into separate diffusion processes for distinct components of the molecular feature space, while the modeling processes also capture the complex dependency among these components. Moreover, angle and torsion angle information is integrated into the networks to enhance the modeling of atom coordinates and utilize spatial information more effectively.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
The paper addresses the problem of 3D molecular generation. Molecular design underpins drug discovery and protein design, but the drug-like chemical space is so large that manual design by experts is both costly and inefficient. Deep generative methods offer a way to narrow the search space for molecular design and reduce costs.

The paper introduces EMDS, an E(3)-equivariant score-based diffusion framework that generates 3D molecules using stochastic differential equations (SDEs). It aims to overcome the limitations of unified Gaussian diffusion methods. In this framework, the full diffusion process is decomposed into separate diffusion processes for distinct components of the molecular feature space, while the model still captures the complex dependencies among those components. Angle and torsion angle information is also integrated into the networks to improve the modeling of atom coordinates and make better use of spatial information.

According to the reported experiments, EMDS outperforms existing 3D molecular generation methods on the QM9 dataset across all evaluation metrics. Ablation experiments support the effectiveness of the framework's key components and show that angle and torsion angle information improves generation performance. The method is also reported to perform efficiently when generating molecules close to real-life scenarios, and it opens new opportunities for challenging biomedical molecule and protein problems. In summary, the paper aims to make the generation of drug molecules more efficient and accurate through a 3D molecular generation framework that accounts for interactions between different components of the molecular space.
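
The decomposition into separate diffusion processes can be illustrated with a small forward-noising sketch. This is not the authors' implementation: it assumes a variance-preserving SDE with a linear noise schedule for each component, and the function names (`vp_coeffs`, `center`, `perturb_molecule`) and the schedule values are hypothetical. Coordinates and their noise are kept at zero center of mass, a common device in equivariant diffusion models for making the process translation-invariant. Whether EMDS uses exactly this device is likewise an assumption.

```python
import numpy as np


def vp_coeffs(t, beta_min, beta_max):
    """Mean scale and std of p(x_t | x_0) for a VP-SDE with linear beta(t).

    Assumes beta(s) = beta_min + s * (beta_max - beta_min) on s in [0, 1].
    """
    int_beta = beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2
    return np.exp(-0.5 * int_beta), np.sqrt(1.0 - np.exp(-int_beta))


def center(x):
    """Remove the center of mass from an (n_atoms, 3) coordinate array."""
    return x - x.mean(axis=0, keepdims=True)


def perturb_molecule(coords, feats, t, rng,
                     coord_betas=(0.1, 20.0), feat_betas=(0.1, 10.0)):
    """Noise coordinates and atom features with separate (hypothetical) schedules."""
    a_x, s_x = vp_coeffs(t, *coord_betas)
    a_h, s_h = vp_coeffs(t, *feat_betas)
    # Coordinate noise is projected onto the zero-center-of-mass subspace.
    eps_x = center(rng.standard_normal(coords.shape))
    # Atom-feature noise is ordinary isotropic Gaussian noise.
    eps_h = rng.standard_normal(feats.shape)
    x_t = a_x * center(coords) + s_x * eps_x
    h_t = a_h * feats + s_h * eps_h
    return x_t, h_t


# Usage: perturb a toy 5-atom molecule with 4-dimensional atom features.
rng = np.random.default_rng(0)
coords = rng.standard_normal((5, 3))
feats = rng.standard_normal((5, 4))
x_t, h_t = perturb_molecule(coords, feats, t=0.5, rng=rng)
```

Giving each component its own schedule is what distinguishes this from a unified Gaussian diffusion over the concatenated features. In a score-based model, a network would then be trained to estimate the score of each perturbed component, conditioned on the others.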