3DReact: Geometric deep learning for chemical reactions

Puck van Gerwen, Ksenia R. Briling, Charlotte Bunne, Vignesh Ram Somnath, Ruben Laplaza, Andreas Krause, Clemence Corminboeuf
DOI: https://doi.org/10.1021/acs.jcim.4c00104
2024-07-12
Abstract: Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction datasets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different datasets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
Chemical Physics, Machine Learning
What problem does this paper attempt to address?
This paper addresses the prediction of chemical reaction properties, using 3D structural information to improve prediction accuracy and data efficiency. It introduces 3DReact, a geometric deep learning model that predicts reaction properties (activation barriers in the reported experiments) from the 3D structures of reactants and products. The key issues it attempts to address are:

1. **Integrating Chemical and Physical Priors**: Current machine learning models for reaction property prediction rely either on atom-mapping information (a chemical prior) or on 3D geometric information (a physical prior), and no existing model integrates both at once. 3DReact aims to fill this gap by encoding both the 3D structures of reactants and products and the atom-mapping information or a proxy for it.
2. **Improving Prediction Performance**: Evaluated on three datasets (GDB7-22-TS, Cyclo-23-TS, and Proparg-21-TS), the model gives accurate and reliable predictions in different atom-mapping regimes. This indicates that it performs well even when high-quality atom-mapping information is not available.
3. **Reducing Dependence on Geometry Quality**: 3DReact aims to reduce the dependence of predictions on the quality of the input 3D geometries, which matters because geometry quality varies widely in practical applications.
4. **Stable Extrapolation Behavior**: Beyond interpolation within existing data, 3DReact shows stable extrapolation, meaning that it maintains good predictive performance on new data outside the range of the training data.
In summary, the goal of this paper is to develop a deep learning model that predicts chemical reaction properties from the 3D structures of reactants and products and that can flexibly handle atom-mapping information of varying quality and availability.
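To make the ideas of an "invariant" representation and of atom mapping concrete, the following is a minimal sketch in Python with NumPy. It is not the 3DReact architecture: the Gaussian distance features, the absolute-difference pooling, and all names (`invariant_atom_features`, `reaction_fingerprint`, `random_rotation`, `CENTERS`, `WIDTH`) are assumptions made purely for illustration. It shows that per-atom features built only from interatomic distances do not change when a structure is rotated or translated, and that an atom map lets reactant and product atoms be compared one to one.

```python
import numpy as np

# Hypothetical radial-basis settings (illustrative values, not from the paper).
CENTERS = np.linspace(0.5, 5.0, 8)
WIDTH = 0.5


def invariant_atom_features(coords):
    """Per-atom features built only from interatomic distances.

    Distances are unchanged by rotations and translations, so these
    features are invariant to both.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                      # shape (n, n)
    rbf = np.exp(-((dist[:, :, None] - CENTERS) ** 2) / (2 * WIDTH**2))
    not_self = ~np.eye(len(coords), dtype=bool)               # drop i == j terms
    return (rbf * not_self[:, :, None]).sum(axis=1)           # shape (n, k)


def reaction_fingerprint(reactant_xyz, product_xyz, atom_map):
    """Toy reaction descriptor from mapped reactant and product atoms.

    atom_map[i] is the index of the product atom that corresponds to
    reactant atom i. The absolute per-atom change is summed over atoms,
    so the result depends on which atoms are paired.
    """
    f_reactant = invariant_atom_features(reactant_xyz)
    f_product = invariant_atom_features(product_xyz)[np.asarray(atom_map)]
    return np.abs(f_product - f_reactant).sum(axis=0)


def random_rotation(rng):
    """Random proper rotation matrix from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reactant = 2.0 * rng.normal(size=(5, 3))
    product = reactant + 0.3 * rng.normal(size=(5, 3))
    atom_map = np.arange(5)

    rot = random_rotation(rng)
    original = reaction_fingerprint(reactant, product, atom_map)
    moved = reaction_fingerprint(reactant @ rot.T + 1.0, product @ rot.T - 2.0, atom_map)
    print("invariant to rotation and translation:", np.allclose(original, moved))
```

In this sketch the atom map only reorders product atoms before the per-atom comparison. When no map is available, a model needs some proxy for the correspondence, which is the situation the different atom-mapping regimes in the paper are meant to probe.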