Data-efficient modeling of catalytic reactions via enhanced sampling and on-the-fly learning of machine learning potentials

Luigi Bonati, Simone Perego

DOI: https://doi.org/10.26434/chemrxiv-2024-nsp7n

2024-06-06

Abstract: Simulating catalytic reactivity under operative conditions poses a significant challenge due to the dynamic nature of the catalysts and the high computational cost of electronic structure calculations. Machine learning potentials offer a promising avenue to simulate dynamics at a fraction of the cost, but they require datasets containing all relevant configurations, particularly reactive ones. Here we present a scheme to construct reactive potentials in a data-efficient manner. This is achieved by combining enhanced sampling methods first with Gaussian processes to discover transition paths and then with graph neural networks to obtain a uniformly accurate description. The necessary configurations are extracted via an active learning procedure based on local environment uncertainty. We validated our approach by studying several reactions related to the decomposition of ammonia on iron-cobalt alloy catalysts. Our scheme proved efficient, requiring only ~1,000 DFT calculations per reaction, and robust, sampling reactive configurations from the different accessible pathways. Using this potential, we calculated free energy profiles and characterized reaction mechanisms, showing the ability to provide microscopic insights into complex processes under dynamic conditions.

Chemistry

What problem does this paper attempt to address?

This paper addresses two main problems in the simulation of catalytic reactions:

1. **Simulating dynamic catalysts**: Under operating conditions, the active sites of a catalyst evolve continuously as the reaction conditions change, which leads to complex changes in the catalyst's microstructure. These changes include surface diffusion and reconstruction induced by reactants or adsorbates. Capturing these dynamic phenomena is extremely difficult experimentally and very challenging in computational modeling.
2. **High computational cost**: Describing these dynamic effects comprehensively requires both an accurate quantum-mechanical (QM) description of the potential energy surface and simulations over extended time and length scales, and these requirements often conflict. Machine learning (ML) potentials can simulate dynamics at a lower cost, but their accuracy depends on the quality of the training dataset, which must include all relevant configurations, particularly reactive ones. Identifying these configurations is especially difficult in a complex dynamic environment under operating conditions, where there may be multiple sets of transition-state configurations.

To solve these problems, the paper proposes a data-efficient method for constructing reactive potentials. By combining enhanced sampling methods with machine learning techniques, it builds an accurate reactive potential from a small number of quantum-mechanical calculations. The method is divided into two stages:

- **Exploration stage**: Use enhanced sampling methods (such as OPES-flooding) and uncertainty-aware molecular dynamics (MD) simulations to discover reaction paths and transition-state structures.
- **Accurate description stage**: Use graph neural networks (GNNs) together with a Gaussian process (GP) based active learning strategy to obtain a uniformly accurate description of the reaction paths.

With this method, the paper shows that for reactions related to ammonia decomposition, only about 1,000 DFT calculations per reaction are needed to construct the reactive potential, calculate free energy profiles, and characterize the reaction mechanisms. The method significantly reduces the required computational resources while still providing microscopic insights under dynamic conditions.
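The selection step behind the active learning procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_for_dft`, the threshold value, and the toy uncertainty numbers are all hypothetical, and the only idea taken from the summary above is that configurations are chosen for DFT labeling based on the uncertainty of their local atomic environments. Here a frame is flagged when the largest per-atom uncertainty exceeds a user-chosen threshold.

```python
from typing import List, Sequence


def select_for_dft(frames_uncertainty: Sequence[Sequence[float]],
                   threshold: float) -> List[int]:
    """Return indices of frames whose largest per-atom uncertainty exceeds threshold.

    frames_uncertainty[i][j] is an (assumed) uncertainty estimate for the local
    environment of atom j in frame i, e.g. as provided by a GP-based model.
    """
    selected = []
    for idx, per_atom in enumerate(frames_uncertainty):
        # A single poorly described local environment is enough to flag the frame.
        if max(per_atom) > threshold:
            selected.append(idx)
    return selected


# Toy data (hypothetical values): 4 frames with 3 atoms each.
toy_uncertainties = [
    [0.01, 0.02, 0.01],
    [0.03, 0.12, 0.02],
    [0.02, 0.04, 0.03],
    [0.20, 0.05, 0.01],
]

picked = select_for_dft(toy_uncertainties, threshold=0.10)
print(picked)  # frames 1 and 3 would be sent to DFT in this toy example
```

In a real workflow the flagged frames would be labeled with DFT and added to the training set before the potential is retrained. Using the maximum over atoms, instead of a frame average, keeps a single unfamiliar environment (for example near a transition state) from being averaged away by many well-described bulk atoms.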
