Abstract: Automated decision-making is increasingly present in today's society. Algorithms based on machine and deep learning decide how much we pay for insurance, translate our thoughts to speech, and shape our consumption of goods (via e-marketing) and knowledge (via search engines). Machine and deep learning models are ubiquitous in science too. In particular, many promising examples are being developed to prove their feasibility for earth sciences applications, such as finding temporal trends or spatial patterns in data or improving parameterization schemes for climate simulations. However, most machine and deep learning applications aim to optimise performance metrics (for instance, accuracy, the fraction of predictions the model got right), which are rarely good indicators of trust (i.e., why were these predictions right?). In fact, with the increase in data volume and model complexity, machine learning and deep learning predictions can be very accurate but also prone to rely on spurious correlations, encode and magnify bias, and draw conclusions that do not incorporate the underlying dynamics governing the system. Because of that, the uncertainty of the predictions and our confidence in the model are difficult to estimate, and the relation between inputs and outputs becomes hard to interpret. Since it is challenging to shift a community from “black” to “glass” boxes, it is more useful to implement Explainable Artificial Intelligence (XAI) techniques right at the beginning of machine learning and deep learning adoption rather than trying to fix fundamental problems later. The good news is that most of the popular XAI techniques are essentially sensitivity analyses, because they consist of systematically perturbing some model component in order to observe how the perturbation affects the model predictions. 
These techniques comprise random sampling, Monte-Carlo simulations, and ensemble runs, which are common methods in geosciences. Moreover, many XAI techniques are reusable because they are model-agnostic and are applied after the model has been fitted. In addition, interpretability provides robust arguments when communicating machine and deep learning predictions to scientists and decision-makers. In order to assist not only the practitioners but also the end-users in the evaluation of machine and deep learning results, we will explain the intuition behind some popular techniques of XAI and of aleatory and epistemic Uncertainty Quantification: (1) Permutation Importance and Gaussian processes on the inputs (i.e., the perturbation of the model inputs), (2) Monte-Carlo Dropout, Deep Ensembles, Quantile Regression, and Gaussian processes on the weights (i.e., the perturbation of the model architecture), (3) Conformal Predictors (useful to estimate the confidence interval on the outputs), and (4) Layerwise Relevance Propagation (LRP), Shapley values, and Local Interpretable Model-Agnostic Explanations (LIME) (designed to visualize how each feature in the data affected a particular prediction). We will also introduce some best practices, such as detecting anomalies in the training data before training, implementing fallbacks for when a prediction is not reliable, and physics-guided learning, which adds constraints to the loss function to avoid physical inconsistencies such as the violation of conservation laws.
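
To make the "perturb and observe" idea concrete, the following is a minimal sketch of two of the techniques listed above: Permutation Importance (perturbing the model inputs) and a split conformal predictor (estimating an interval on the outputs). It is not the authors' implementation. The synthetic data, the linear least-squares model standing in for a fitted machine learning model, and the names `fit_linear`, `permutation_importance`, and `split_conformal_halfwidth` are all assumptions made for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear model with intercept (a stand-in for any fitted model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ coef


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when a single input column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to y but keeps its distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            importances[j] += np.mean((predict(X_perm) - y) ** 2) - baseline
    return importances / n_repeats


def split_conformal_halfwidth(predict, X_cal, y_cal, alpha=0.1):
    """Half-width q so that [pred - q, pred + q] targets (1 - alpha) coverage."""
    scores = np.sort(np.abs(y_cal - predict(X_cal)))  # absolute residuals
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample corrected rank
    return scores[k - 1] if k <= n else np.inf


# Hypothetical data: feature 0 matters most, feature 1 a little, feature 2 not at all.
rng = np.random.default_rng(42)
X = rng.normal(size=(1500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=1500)

X_train, X_cal, X_test = X[:500], X[500:1000], X[1000:]
y_train, y_cal, y_test = y[:500], y[500:1000], y[1000:]

predict = fit_linear(X_train, y_train)

# Both techniques only call predict(), so they are model-agnostic and post hoc.
importances = permutation_importance(predict, X_test, y_test)
q = split_conformal_halfwidth(predict, X_cal, y_cal, alpha=0.1)
coverage = np.mean(np.abs(y_test - predict(X_test)) <= q)

print("permutation importances:", importances)
print("conformal half-width:", q)
print("empirical test coverage:", coverage)
```

Both functions treat the model as a black box that exposes only a prediction function, which is why the same code could be reused with a neural network or a tree ensemble. The conformal interval relies on the calibration and test data being exchangeable, an assumption that may not hold for strongly autocorrelated geoscientific time series.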
