3D Molecular Geometry Analysis with 2D Graphs

Zhao Xu, Yaochen Xie, Youzhi Luo, Xuan Zhang, Xinyi Xu, Meng Liu, Kaleb Dickerson, Cheng Deng, Maho Nakata, Shuiwang Ji
2023-05-02
Abstract: Ground-state 3D geometries of molecules are essential for many molecular analysis tasks. Modern quantum mechanical methods can compute accurate 3D geometries but are computationally prohibitive. Currently, an efficient alternative to computing ground-state 3D molecular geometries from 2D graphs is lacking. Here, we propose a novel deep learning framework to predict 3D geometries from molecular graphs. To this end, we develop an equilibrium message passing neural network (EMPNN) to better capture ground-state geometries from molecular graphs. To provide a testbed for 3D molecular geometry analysis, we develop a benchmark that includes a large-scale molecular geometry dataset, data splits, and evaluation protocols. Experimental results show that EMPNN can efficiently predict more accurate ground-state 3D geometries than RDKit and other deep learning methods. Results also show that the proposed framework outperforms self-supervised learning methods on property prediction tasks.
Chemical Physics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper aims to address the following key issues:

### Research Background and Objectives

- **Importance of 3D Molecular Geometries**: Ground-state 3D molecular geometries are crucial for many molecular analysis tasks. They provide spatial information that helps predict properties closely tied to molecular shape.
- **Limitations of Quantum Chemistry Methods**: Modern quantum chemistry methods can calculate 3D molecular geometries accurately, but they are computationally expensive and time-consuming.
- **Shortcomings of Existing Methods**:
  - There is currently no efficient alternative for computing ground-state 3D molecular geometries from 2D molecular graphs.
  - Existing generative methods tend to produce multiple non-ground-state geometries rather than predicting the single most stable ground-state structure.

### Main Research Content

- **Proposed New Framework**: The paper proposes a deep learning framework that predicts the ground-state 3D geometries of molecules directly from 2D molecular graphs, using an equilibrium message passing neural network (EMPNN).
- **Two-Stage Prediction Framework** (Geometry-Aware Prediction, GAP):
  1. **3D Geometry Prediction**: Predict the 3D coordinates of the atoms directly, rather than predicting the matrix of interatomic distances and reconstructing coordinates from it.
  2. **Downstream Property Prediction Based on Predicted Geometries**: Use the predicted ground-state 3D geometries to assist molecular property prediction.

### Key Contributions

1. **Introduction of a New Problem**: Predicting ground-state 3D molecular geometries from 2D molecular graphs.
2. **Two-Stage Framework**: EMPNN predicts ground-state 3D geometries, and a 3D graph neural network predicts properties from the predicted geometries.
3. **Benchmark Dataset**: A large-scale dataset named Molecule3D, containing numerous molecules with accurate ground-state geometries, along with corresponding data splits and evaluation protocols.

### Experimental Results

- **Accuracy of 3D Geometry Prediction**: EMPNN predicts ground-state 3D geometries more accurately and more efficiently than RDKit and other deep learning methods.
- **Property Prediction Performance**: The proposed framework outperforms existing self-supervised learning methods on MoleculeNet property prediction tasks, demonstrating the value of the 3D geometry prediction problem.

In summary, the main goal of this paper is to develop an efficient and accurate method for predicting the ground-state 3D geometries of molecules from 2D molecular graphs, and to demonstrate its potential for molecular property prediction.
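The summary distinguishes predicting 3D coordinates directly from predicting a matrix of interatomic distances. The two representations are closely related: a coordinate set determines its pairwise distances, and those distances do not change when the molecule is rotated or translated. The sketch below illustrates that relationship and shows one plausible way to score a predicted geometry against a reference, using the mean absolute error over pairwise distances. It is an illustration under stated assumptions, not the paper's evaluation protocol: the function names (`pairwise_distances`, `distance_mae`, `rotate_z_and_shift`) and the three-atom coordinates are hypothetical and made up for the example.

```python
import math
from itertools import combinations

# One (x, y, z) tuple per atom; units are assumed to be angstroms.
Coords = list[tuple[float, float, float]]


def pairwise_distances(coords: Coords) -> list[float]:
    """Return the Euclidean distance for every unordered atom pair (i < j)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_mae(pred: Coords, ref: Coords) -> float:
    """Mean absolute error between the pairwise distances of two geometries.

    Because only distances are compared, the score is unaffected by any
    rigid rotation or translation of either geometry.
    """
    if len(pred) != len(ref):
        raise ValueError("geometries must have the same number of atoms")
    d_pred = pairwise_distances(pred)
    d_ref = pairwise_distances(ref)
    if not d_ref:  # fewer than two atoms: no pairs to compare
        return 0.0
    return sum(abs(p - r) for p, r in zip(d_pred, d_ref)) / len(d_ref)


def rotate_z_and_shift(coords: Coords, angle: float,
                       shift: tuple[float, float, float]) -> Coords:
    """Rotate about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]


# Hypothetical three-atom geometry, invented for illustration only.
reference = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# Same shape, different pose: the distance-based error stays at zero
# (up to floating-point rounding).
moved = rotate_z_and_shift(reference, math.pi / 3, (1.0, -2.0, 0.5))

# One bond stretched by 0.1: the distance-based error becomes positive.
stretched = [(0.0, 0.0, 0.0), (1.06, 0.0, 0.0), (-0.24, 0.93, 0.0)]

print(distance_mae(moved, reference))
print(distance_mae(stretched, reference))
```

A distance-based score of this kind avoids having to align the predicted and reference structures before comparing them, which is why it is a convenient choice when a model outputs raw coordinates in an arbitrary pose.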