Abstract: Heterogeneous catalysts are rather complex materials that come in many classes (e.g., metals, oxides, carbides) and shapes. At the same time, the interaction of the catalyst surface with even a relatively simple gas-phase environment such as syngas (CO and H2) may already produce a wide variety of reaction intermediates ranging from atoms to complex molecules. The starting point for creating predictive maps of, e.g., surface coverages or chemical activities of potential catalyst materials is the reliable prediction of adsorption enthalpies of all of these intermediates. For simple systems, direct density functional theory (DFT) calculations are currently the method of choice. However, a wider exploration of complex materials and reaction networks generally requires enthalpy predictions at lower computational cost.

The use of machine learning (ML) and related techniques to make accurate and low-cost predictions of quantum-mechanical calculations has gained increasing attention lately. The employed approaches range from physically motivated models through hybrid physics-ΔML approaches to complete black-box methods such as deep neural networks. In recent work, we have explored the possibilities for using a compressed sensing method (Sure Independence Screening and Sparsifying Operator, SISSO) to identify sparse (low-dimensional) descriptors for the prediction of adsorption enthalpies at various active-site motifs of metals and oxides. We start from a set of physically motivated primary features such as atomic acid/base properties, coordination numbers, or band moments and let the data and the compressed sensing method find the best algebraic combination of these features. Here, we take this work as a starting point to categorize and compare recent ML-based approaches with a particular focus on model sparsity, data efficiency, and the level of physical insight that one can obtain from the model.

Looking ahead, while many works to date have focused solely on predicting databases of, e.g., adsorption enthalpies, there is also an emerging interest in our field in using ML predictions to answer fundamental science questions about the functioning of heterogeneous catalysts or perhaps even to design better catalysts than we know today. This task is significantly simplified in works that make use of scaling-relation-based models (volcano curves), where the model outcome is determined by only one or two adsorption enthalpies, which consequently become the sole target for ML-based high-throughput screening or design. However, the availability of cheap ML energetics also allows going beyond scaling relations. On the basis of our own work in this direction, we will discuss the additional physical insight that can be achieved by integrating ML-based predictions with traditional catalysis modeling techniques from thermal and electrocatalysis, such as the computational hydrogen electrode and microkinetic modeling, as well as the challenges that lie ahead.
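The descriptor search outlined in the abstract (primary features, algebraic combinations, a sparse final model) can be illustrated with a toy sketch. This is not the SISSO code or the authors' workflow: the data are synthetic, the feature names `cn`, `d_center`, and `en` are hypothetical stand-ins for primary features, only one round of products and ratios is generated, and the screening is done once against the target rather than iteratively against residuals as in the full method.

```python
import itertools
import numpy as np

def build_candidates(primary, names):
    """One round of algebraic combinations (products and ratios) of primary features."""
    base = dict(zip(names, primary.T))
    feats = dict(base)  # keep the primary features as candidates too
    for (a, xa), (b, xb) in itertools.combinations(base.items(), 2):
        feats[f"({a}*{b})"] = xa * xb
        feats[f"({a}/{b})"] = xa / xb
        feats[f"({b}/{a})"] = xb / xa
    return feats

def screen(candidates, y, k):
    """Screening step: keep the k candidates most correlated with the target."""
    scores = {n: abs(np.corrcoef(x, y)[0, 1]) for n, x in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def sparsify(candidates, shortlist, y, dim):
    """Sparsifying step: exhaustive least squares over all dim-sized subsets."""
    best = (np.inf, None, None)
    for combo in itertools.combinations(shortlist, dim):
        A = np.column_stack([candidates[n] for n in combo] + [np.ones_like(y)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rmse = np.sqrt(np.mean((A @ coef - y) ** 2))
        if rmse < best[0]:
            best = (rmse, combo, coef)
    return best

# Synthetic data (assumption): the "enthalpy" depends on d_center / cn plus noise.
rng = np.random.default_rng(0)
n = 200
cn = rng.uniform(3.0, 12.0, n)          # hypothetical coordination number
d_center = rng.uniform(-4.0, -1.0, n)   # hypothetical band moment
en = rng.uniform(1.5, 2.5, n)           # hypothetical acid/base-like property
y = 1.5 * d_center / cn + 0.2 + 0.01 * rng.standard_normal(n)

candidates = build_candidates(np.column_stack([cn, d_center, en]),
                              ["cn", "d_center", "en"])
shortlist = screen(candidates, y, k=5)
rmse, combo, coef = sparsify(candidates, shortlist, y, dim=1)
print(combo, coef, rmse)
```

Because the selected descriptor is an explicit algebraic expression of the primary features, it can be read and interpreted directly, which is the kind of sparsity and physical insight the abstract emphasizes.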
