Abstract: Predictions and forecasts of machine learning models should take the form of probability distributions, aiming to increase the quantity of information communicated to end users. Although applications of probabilistic prediction and forecasting with machine learning models in academia and industry are becoming more frequent, related concepts and methods have not been formalized and structured under a holistic view of the entire field. Here, we review the topic of predictive uncertainty estimation with machine learning algorithms, as well as the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions. The review covers a time period spanning from the introduction of early statistical (linear regression and time series models, based on Bayesian statistics or quantile regression) to recent machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting and deep learning algorithms) that are more flexible by nature. The review of the progress in the field expedites our understanding of how to develop new algorithms tailored to users' needs, since the latest advancements are based on some fundamental concepts applied to more complex algorithms. We conclude by classifying the material and discussing challenges that are becoming hot topics of research.
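The abstract names consistent scoring functions as the metrics for assessing probabilistic predictions. As a minimal sketch of what consistency means, the snippet below uses the quantile (pinball) loss, a standard scoring function that is consistent for quantiles: the prediction that minimizes its average over a sample is the sample quantile at the chosen level. The function names, the simulated standard normal data, and the grid search are illustrative assumptions, not material from the paper.

```python
import random


def pinball_loss(y, q, tau):
    """Quantile (pinball) loss of predicting q for outcome y at level tau."""
    diff = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def mean_pinball_loss(ys, q, tau):
    """Average pinball loss of a single prediction q over all outcomes."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)


# Assumption: simulated outcomes from a standard normal distribution.
random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(2000)]
tau = 0.9

# Brute-force search for the prediction with the lowest average loss.
candidates = [i / 100 for i in range(-300, 301)]
best_q = min(candidates, key=lambda q: mean_pinball_loss(ys, q, tau))

# Empirical 0.9-quantile of the same sample, for comparison.
empirical_q = sorted(ys)[int(tau * len(ys))]

print(f"loss-minimizing prediction: {best_q:.2f}")
print(f"empirical {tau}-quantile:     {empirical_q:.2f}")
```

The two printed values should agree up to the grid resolution, which is the sense in which the loss rewards a forecaster for reporting the true quantile rather than some other summary of the distribution.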

What problem does this paper attempt to address?

The paper reviews research progress and methods in predictive uncertainty estimation with machine learning models. Specifically, it addresses the following key issues:

1. **Enhancing Predictive Information Content**: Machine learning models have traditionally provided single-point predictions. These are intuitive but carry limited information. To reflect the uncertainty of a prediction, the paper advocates presenting predictions in the form of probability distributions, which give users richer information and help them make better decisions.
2. **Reviewing Methods for Predictive Uncertainty Estimation**: Although probabilistic prediction and forecasting based on machine learning are increasingly used in academia and industry, the related concepts and methods have not been systematically organized and summarized. The paper therefore provides a comprehensive review of methods for predictive uncertainty estimation and discusses the metrics for evaluating probabilistic predictions (consistent scoring functions and proper scoring rules).
3. **Covering Algorithms from Simple to Complex**: The paper covers early statistical methods (such as linear regression and time series models, based on Bayesian statistics or quantile regression) as well as more recent and flexible machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting, and deep learning). Tracing this development helps clarify how new algorithms can be tailored to specific user needs.
4. **Integrating Research Results from Different Fields**: The paper also discusses how to combine seemingly unrelated features to form new probabilistic prediction algorithms and explores some special application scenarios, such as technical combinations, time series forecasting, spatial prediction, extreme event prediction, and measurement errors.

In summary, the paper addresses how to use machine learning models effectively for predictive uncertainty estimation by comprehensively reviewing existing theory and methods. On this basis, it aims to promote the development of new algorithms that meet the needs of different users.
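The difference between a single-point prediction and a predictive distribution can be illustrated with a minimal sketch. It fits a straight line by least squares and, as a simplifying assumption, treats the residuals as Gaussian with constant variance, so each prediction becomes a normal distribution from which a 90% interval can be read. The simulated data, the helper names, and the plug-in Gaussian (which ignores parameter uncertainty) are illustrative choices, not the paper's method.

```python
import random
from statistics import NormalDist, mean

# Assumption: simulated data with a linear signal and Gaussian noise.
random.seed(1)
xs = [random.uniform(0, 10) for _ in range(500)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1.0) for x in xs]

# Ordinary least squares for a single predictor.
x_bar, y_bar = mean(xs), mean(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# Residual standard deviation (two fitted parameters).
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sigma = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5


def predict_point(x):
    """Single-point prediction: the fitted conditional mean."""
    return intercept + slope * x


def predict_distribution(x):
    """Plug-in Gaussian predictive distribution centered on the point prediction."""
    return NormalDist(mu=predict_point(x), sigma=sigma)


dist = predict_distribution(4.0)
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)
print(f"point prediction at x=4: {predict_point(4.0):.2f}")
print(f"90% prediction interval: [{lower:.2f}, {upper:.2f}]")

# Check on fresh simulated data how often the 90% interval contains the outcome.
test_xs = [random.uniform(0, 10) for _ in range(2000)]
test_ys = [2.0 + 0.5 * x + random.gauss(0, 1.0) for x in test_xs]
hits = 0
for x, y in zip(test_xs, test_ys):
    d = predict_distribution(x)
    if d.inv_cdf(0.05) <= y <= d.inv_cdf(0.95):
        hits += 1
coverage = hits / len(test_xs)
print(f"empirical coverage: {coverage:.3f}")
```

The point prediction alone says nothing about how far the outcome may fall from it; the interval and its empirical coverage are the additional information a probabilistic prediction communicates.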

A review of predictive uncertainty estimation with machine learning

How to evaluate uncertainty estimates in machine learning for regression?

A review of machine learning concepts and methods for addressing challenges in probabilistic hydrological post-processing and forecasting

Introducing an Improved Information-Theoretic Measure of Predictive Uncertainty

A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning

Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data

On Information-Theoretic Measures of Predictive Uncertainty

Probabilistic Machine Learning for Healthcare

A Structured Review of Literature on Uncertainty in Machine Learning & Deep Learning

Uncertainty-aware Evaluation of Machine Learning Performance in binary Classification Tasks

Uncertainty as a Fairness Measure

Predictive Multiplicity in Probabilistic Classification

Portfolio Optimization Strategies: New Approaches Based on Machine Learning Forecasting

Uncertainty estimation of machine learning spatial precipitation predictions from satellite data

Assessing predictability of environmental time series with statistical and machine learning models

Uncertainty-based Fairness Measures

Predicting weather forecast uncertainty with machine learning

Uncertainty Modelling in Deep Networks: Forecasting Short and Noisy Series

Navigating Uncertainties in Machine Learning for Structural Dynamics: A Comprehensive Review of Probabilistic and Non-Probabilistic Approaches in Forward and Inverse Problems

Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures

Making Early Predictions of the Accuracy of Machine Learning Applications