Molecular relaxation by reverse diffusion with time step prediction

Khaled Kahouli, Stefaan Simon Pierre Hessmann, Klaus-Robert Müller, Shinichi Nakajima, Stefan Gugler, Niklas Wolf Andreas Gebauer
DOI: https://doi.org/10.1088/2632-2153/ad652c
2024-08-03
Abstract: Molecular relaxation, finding the equilibrium state of a non-equilibrium structure, is an essential component of computational chemistry to understand reactivity. Classical force field (FF) methods often rely on insufficient local energy minimization, while neural network FF models require large labeled datasets encompassing both equilibrium and non-equilibrium structures. As a remedy, we propose MoreRed, molecular relaxation by reverse diffusion, a conceptually novel and purely statistical approach where non-equilibrium structures are treated as noisy instances of their corresponding equilibrium states. To enable the denoising of arbitrarily noisy inputs via a generative diffusion model, we further introduce a novel diffusion time step predictor. Notably, MoreRed learns a simpler pseudo potential energy surface (PES) instead of the complex physical PES. It is trained on a significantly smaller, and thus computationally cheaper, dataset consisting of solely unlabeled equilibrium structures, avoiding the computation of non-equilibrium structures altogether. We compare MoreRed to classical FFs, equivariant neural network FFs trained on a large dataset of equilibrium and non-equilibrium data, as well as a semi-empirical tight-binding model. To assess this quantitatively, we evaluate the root-mean-square deviation between the found equilibrium structures and the reference equilibrium structures as well as their energies.
Chemical Physics, Machine Learning, Computational Physics
What problem does this paper attempt to address?
The paper addresses the problem of molecular relaxation in computational chemistry: how to efficiently and accurately find the equilibrium structure that corresponds to a given non-equilibrium structure. Classical force field (FF) methods often rely on local energy minimization, which can be insufficient, while neural network FF models can improve accuracy but require large labeled datasets covering both equilibrium and non-equilibrium structures. As a remedy, the authors propose MoreRed (Molecular Relaxation by Reverse Diffusion), a purely statistical method that treats non-equilibrium structures as noisy instances of their corresponding equilibrium states and removes this noise through a reverse diffusion process to recover the equilibrium structure. Instead of the complex physical potential energy surface (PES), MoreRed learns a simpler pseudo PES. It can therefore be trained on a significantly smaller dataset consisting solely of unlabeled equilibrium structures, which greatly reduces the computational cost of generating training data. To denoise input structures at arbitrary noise levels, the authors also introduce a diffusion time step predictor. This predictor estimates how far the input structure deviates from the equilibrium state and determines the appropriate diffusion time step accordingly, so that the reverse diffusion process can proceed effectively. In the experiments, the authors evaluate MoreRed on the QM7-X dataset and compare it with classical FFs, a semi-empirical tight-binding model, and machine learning FF models.
The results show that MoreRed can accurately map non-equilibrium structures back to the manifold of equilibrium structures seen during training, and that it performs well even though it is trained on far less data than the machine learning FF models. MoreRed is also more robust to variations in the non-equilibrium test structures and still identifies the correct equilibrium structures. Overall, MoreRed provides a novel and efficient molecular relaxation method that is particularly suitable when only equilibrium structure data is available, offering a powerful tool for molecular dynamics simulations in computational chemistry.
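The interplay between the time step predictor and the reverse diffusion process described above can be illustrated with a small toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a variance-exploding noise schedule and a deterministic reverse step (neither is specified in this summary), and the names `SIGMAS`, `X_EQ`, `toy_time_predictor`, `toy_noise_predictor`, `relax`, and `rmsd` are all hypothetical. In particular, the two `toy_*` functions stand in for trained neural networks and simply peek at a known equilibrium geometry, which a real model could not do. The `rmsd` helper mirrors the root-mean-square deviation metric mentioned in the abstract, here without rotational alignment.

```python
import numpy as np

# Assumed variance-exploding schedule: SIGMAS[0] = 0 is the clean state and
# SIGMAS[t] grows with the time step t. The values are illustrative only.
T = 50
SIGMAS = np.concatenate([[0.0], np.geomspace(0.01, 1.0, T)])

# Hypothetical equilibrium geometry (5 atoms, arbitrary units); not from the paper.
X_EQ = np.array([
    [0.0, 0.0, 0.0],
    [1.1, 0.0, 0.0],
    [-0.4, 1.0, 0.0],
    [-0.4, -0.5, 0.9],
    [-0.4, -0.5, -0.9],
])


def toy_time_predictor(x):
    """Stand-in for a trained time step predictor (assumption).

    Counts the noise levels that do not exceed the per-coordinate deviation
    of x from X_EQ. A return value of 0 means "treated as relaxed".
    """
    deviation = np.sqrt(np.mean((x - X_EQ) ** 2))
    return int(np.searchsorted(SIGMAS[1:], deviation, side="right"))


def toy_noise_predictor(x, t):
    """Stand-in for a trained denoiser (assumption): noise mapping X_EQ to x at level t."""
    return (x - X_EQ) / SIGMAS[t]


def relax(x, predict_time, predict_noise, max_steps=200):
    """Denoise x step by step until the predicted time step reaches 0."""
    x = np.array(x, dtype=float)
    visited = []  # time steps chosen by the predictor, in order
    for _ in range(max_steps):
        t = predict_time(x)
        if t == 0:
            break
        visited.append(t)
        eps = predict_noise(x, t)
        # Deterministic reverse step from noise level t to level t - 1.
        x = x - (SIGMAS[t] - SIGMAS[t - 1]) * eps
    return x, visited


def rmsd(a, b):
    """Root-mean-square deviation over atoms, without rotational alignment."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_start = X_EQ + 0.3 * rng.standard_normal(X_EQ.shape)  # perturbed structure
    x_relaxed, visited = relax(x_start, toy_time_predictor, toy_noise_predictor)
    print("RMSD before:", rmsd(x_start, X_EQ))
    print("RMSD after: ", rmsd(x_relaxed, X_EQ))
    print("reverse steps taken:", len(visited))
```

The point of the sketch is the control flow: the loop never needs to be told how noisy the input is, because the predictor picks the starting time step and the stopping point. With trained models in place of the toy functions, a stochastic reverse step and a different schedule would likely be used.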