Abstract: Interpretability and explainability are essential principles of machine learning model and method design and development for applications in medicine, economics, law, and the natural sciences. Over the last 30 years, many techniques motivated by these properties have been developed. This review is intended for a general machine learning audience interested in exploring the challenges of interpretation and explanation beyond logistic regression or random forest variable importance. We will examine the inductive biases behind interpretable and explainable machine learning and illustrate them with concrete examples from the literature.

Interpretability and explainability are crucial for machine learning (ML) and statistical applications in medicine, economics, law, and the natural sciences, and form an essential principle for ML model design and development. Although interpretability and explainability have escaped a precise and universal definition, many models and techniques motivated by these properties have been developed over the last 30 years, with the focus currently shifting toward deep learning. We will consider concrete examples of the state of the art, including specially tailored rule‐based, sparse, and additive classification models, interpretable representation learning, and methods for explaining black‐box models post hoc. The discussion will emphasize the need for and relevance of interpretability and explainability, the divide between them, and the inductive biases behind the presented "zoo" of interpretable models and explanation methods.

This article is categorized under: Fundamental Concepts of Data and Knowledge > Explainable AI; Technologies > Machine Learning; Commercial, Legal, and Ethical Issues > Social Considerations

Thermodynamics-inspired explanations of artificial intelligence

Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models

Explaining Explanations: An Overview of Interpretability of Machine Learning

Explainable AI: A Review of Machine Learning Interpretability Methods

Explaining machine learning models using entropic variable projection

Machine Explanations and Human Understanding

An Evaluation of the Human-Interpretability of Explanation

A Perspective on Explanations of Molecular Prediction Models

Explain To Decide: A Human-Centric Review on the Role of Explainable Artificial Intelligence in AI-assisted Decision Making

Model-Agnostic Interpretability of Machine Learning

Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces

Explainable Artificial Intelligence and Machine Learning: A reality rooted perspective

Distance-Restricted Explanations: Theoretical Underpinnings & Efficient Implementation

Solving the enigma: Deriving optimal explanations of deep networks

A Theoretical Framework for AI Models Explainability with Application in Biomedicine

Physics-Inspired Interpretability Of Machine Learning Models

Interpretable and explainable machine learning: A methods‐centric overview with concrete examples

Helpful, Misleading or Confusing: How Humans Perceive Fundamental Building Blocks of Artificial Intelligence Explanations

Explaining Explanations in AI

What is Interpretability?