Abstract: Machine learning has become a common and powerful tool in materials research. As more data become available through high-performance computing and high-throughput experimentation, machine learning has shown proven potential to accelerate scientific research and technology development. Though the uptake of data-driven approaches for materials science is at an exciting, early stage, to realize the true potential of machine learning models for successful scientific discovery, they must have qualities beyond pure predictive power. The predictions and inner workings of models should be explainable, to a certain degree, to human experts, permitting the identification of potential model issues or limitations, building trust in model predictions, and unveiling unexpected correlations that may lead to scientific insights. In this work, we summarize applications of interpretability and explainability techniques for materials science and chemistry and discuss how these techniques can improve the outcome of scientific studies. We start by defining the fundamental concepts of interpretability and explainability in machine learning and making them less abstract by providing examples in the field. We show how interpretability in scientific machine learning has additional constraints compared to general applications. Building upon formal definitions in machine learning, we formulate the basic trade-offs among the explainability, completeness, and scientific validity of model explanations in scientific problems. In the context of these trade-offs, we discuss how interpretable models can be constructed, what insights they provide, and what drawbacks they have. We present numerous examples of the application of interpretable machine learning in a variety of experimental and simulation studies, encompassing first-principles calculations, physicochemical characterization, materials development, and integration into complex systems.
We discuss the varied impacts and uses of interpretability in these cases according to the nature and constraints of the scientific study of interest. We discuss various challenges for interpretable machine learning in materials science and, more broadly, in scientific settings. In particular, we emphasize the risks of inferring causation or reaching generalization by purely interpreting machine learning models and the need for uncertainty estimates for model explanations. Finally, we showcase a number of exciting developments in other fields that could benefit interpretability in materials science problems. Adding interpretability to a machine learning model often requires no more technical know-how than building the model itself. By providing concrete examples of studies (many with associated open-source code and data), we hope that this Account will encourage all practitioners of machine learning in materials science to look deeper into their models.
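As a rough illustration of how little extra machinery a basic explanation can require, the sketch below computes permutation feature importance for a black-box regressor. It is a minimal example under stated assumptions, not a method taken from this Account: the three descriptors, the synthetic data, and the fixed linear `model` (standing in for a trained model) are all hypothetical, and permutation importance is only one of many possible techniques.

```python
import random


def mse(model, X, y):
    """Mean squared error of the model's predictions on (X, y)."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Per feature, the mean increase in MSE after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical data: three made-up descriptors per sample. The target depends
# strongly on descriptor 0, weakly on descriptor 1, and not at all on descriptor 2.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * a + 0.5 * b for a, b, _ in X]


def model(row):
    """Stand-in for a trained model (assumption: it has learned the true relation)."""
    return 3.0 * row[0] + 0.5 * row[1]


importances = permutation_importance(model, X, y)
for name, value in zip(["descriptor_0", "descriptor_1", "descriptor_2"], importances):
    print(f"{name}: {value:.3f}")
```

A ranking like this reflects only what the model relies on for the given data. In line with the cautions above, it does not by itself establish causation, and repeating the shuffles (as `n_repeats` does) gives only a crude handle on the uncertainty of the explanation.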
