Abstract: Machine learning has become a common and powerful tool in materials research. As more data become available through high-performance computing and high-throughput experimentation, machine learning has proven its potential to accelerate scientific research and technology development. The uptake of data-driven approaches in materials science is at an exciting, early stage. To realize the true potential of machine learning models for scientific discovery, however, they must have qualities beyond purely predictive power. The predictions and inner workings of models should be explainable, to a certain degree, by human experts, permitting the identification of potential model issues or limitations, building trust in model predictions, and unveiling unexpected correlations that may lead to scientific insights. In this work, we summarize applications of interpretability and explainability techniques for materials science and chemistry and discuss how these techniques can improve the outcome of scientific studies. We start by defining the fundamental concepts of interpretability and explainability in machine learning and making them less abstract by providing examples in the field. We show how interpretability in scientific machine learning has additional constraints compared to general applications. Building upon formal definitions in machine learning, we formulate the basic trade-offs among the explainability, completeness, and scientific validity of model explanations in scientific problems. In the context of these trade-offs, we discuss how interpretable models can be constructed, what insights they provide, and what drawbacks they have. We present numerous examples of the application of interpretable machine learning in a variety of experimental and simulation studies, encompassing first-principles calculations, physicochemical characterization, materials development, and integration into complex systems.
We discuss the varied impacts and uses of interpretability in these cases according to the nature and constraints of the scientific study of interest. We discuss various challenges for interpretable machine learning in materials science and, more broadly, in scientific settings. In particular, we emphasize the risks of inferring causation or reaching generalization purely by interpreting machine learning models and the need for uncertainty estimates for model explanations. Finally, we showcase a number of exciting developments in other fields that could benefit interpretability in materials science problems. Adding interpretability to a machine learning model often requires no more technical know-how than building the model itself. By providing concrete examples of studies (many with associated open-source code and data), we hope that this Account will encourage all practitioners of machine learning in materials science to look deeper into their models.
