Developing General Reactive Element-Based Machine Learning Potentials as the Main Computational Engine for Heterogeneous Catalysis

Peijun Hu, Changxi Yang, Chenyu Wu, Wenbo Xie, Daiqian Xie
DOI: https://doi.org/10.26434/chemrxiv-2024-r8l6j
2024-10-14
Abstract: Machine learning potentials (MLPs) have emerged as a promising technique to significantly enhance computational efficiency by replacing expensive quantum mechanical calculations. However, developing truly universal MLPs remains challenging: the consensus is that MLPs can only be used for structures similar to those they were trained on, while the vast and diverse chemical space is difficult to sample fully with the common system-dependent sampling methods. Here, our approach leverages a unique random exploration via imaginary chemicals optimization (REICO) strategy, which enables unbiased exploration of chemical space by focusing on atomic interactions. The resulting element-based MLP (EMLP) is inherently general and reactive, capable of accurately predicting elementary reactions without explicit structural or reaction pathway inputs. Benchmarked across various representative calculations of heterogeneous catalysis, our EMLP achieves quantitative agreement with density functional theory (DFT) calculations. This demonstrates the potential of EMLP as a powerful, general, and user-friendly tool for modeling complex chemical systems, paving the way to replace DFT calculations for large and intricate systems. Our approach is also applicable to broader fields such as materials science and molecular biology, representing a paradigm shift in MLP-related research.
Chemistry
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Generality and Reactivity Challenges**: Conventional machine learning potentials (MLPs) greatly improve efficiency, but they are reliable only for structures similar to those in their training data and generalize poorly to unseen structures. Existing MLPs also struggle with complex chemical reaction pathways, particularly because transition states are hard to sample.
2. **Big Data Sampling Problem**: System-dependent sampling methods lead to data redundancy and overfitting, which limits the range of systems an MLP can be applied to. For complex catalytic systems, such as multi-component catalysts or reactions in solvent environments, these sampling methods fail to cover a wide chemical space effectively.

To address these issues, the authors propose an element-based machine learning potential (EMLP), trained on a balanced and general dataset built with a random exploration via imaginary chemicals optimization (REICO) strategy. This approach lets the EMLP predict energies and elementary reactions without explicit structural or reaction pathway inputs, reaching agreement with density functional theory (DFT) across representative heterogeneous catalysis calculations while significantly reducing computational cost. The authors state that the method is not limited to catalysis and is also applicable to fields such as materials science and molecular biology.
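
The claim of agreement with DFT is typically checked by comparing predicted and reference energies with simple error metrics. The following is a minimal sketch of such a comparison, not the authors' benchmark protocol: the function name `error_metrics` and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def error_metrics(predicted, reference):
    """Return (MAE, RMSE) between two equal-length sequences of energies."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("need at least one energy pair")
    diffs = [p - r for p, r in zip(predicted, reference)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


# Hypothetical reaction barriers in eV (illustrative values, not from the paper).
dft_barriers = [0.50, 1.00, 1.50]  # reference values, e.g. from DFT
mlp_barriers = [0.60, 0.90, 1.50]  # values predicted by a potential

mae, rmse = error_metrics(mlp_barriers, dft_barriers)
print(f"MAE = {mae:.3f} eV, RMSE = {rmse:.3f} eV")
```

Reporting both metrics is a common design choice: the MAE gives the typical deviation, while the RMSE weights large outliers more heavily, which matters when a few poorly described transition states dominate the error.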