Abstract: We propose a novel, succinct, and effective approach to distribution prediction for quantifying uncertainty in machine learning. It provides adaptively flexible prediction of the distribution $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$ in regression tasks. The quantiles of this conditional distribution, at probability levels spread across the interval $(0,1)$, are boosted by additive models that we design to be intuitive and interpretable. We seek an adaptive balance between structural integrity and flexibility for $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$: a Gaussian assumption lacks flexibility for real data, while highly flexible approaches (e.g., estimating the quantiles separately without a distribution structure) have their own drawbacks and may not generalize well. Our ensemble multi-quantiles approach, called EMQ, is entirely data-driven and can gradually depart from the Gaussian and discover the optimal conditional distribution during boosting. On extensive regression tasks from UCI datasets, we show that EMQ achieves state-of-the-art performance compared to many recent uncertainty quantification methods. Visualization results further illustrate the necessity and the merits of such an ensemble model.

What problem does this paper attempt to address?

This paper addresses uncertainty quantification for regression tasks in machine learning. Specifically, it aims to improve the prediction of the conditional distribution $P(y|X = x)$ so that the uncertainty in model predictions is quantified more accurately. Traditional methods based on a Gaussian assumption lack flexibility on real data, while highly flexible methods (such as estimating quantiles independently, without any distribution structure) may generalize poorly. The paper therefore proposes a new Ensemble Multi-Quantiles (EMQ) method that predicts the conditional distribution in an adaptive and flexible manner, and demonstrates its performance on multiple datasets.

### Core problems of the paper

1. **Importance of uncertainty quantification**:
   - Although deep learning models achieve state-of-the-art performance on many tasks, their uncertainty estimates are often overconfident, which can lead to high-risk decisions in practical applications.
   - Accurate uncertainty estimates let a model defer decisions to human experts when uncertainty is high, or hand control to a human operator in scenarios such as autonomous driving.
2. **Limitations of existing methods**:
   - Methods that assume a Gaussian distribution lack flexibility and cannot capture multimodality, asymmetry, or heavy tails in real data.
   - Highly flexible methods (such as non-parametric methods) may overfit and produce density functions that are difficult to interpret.
3. **Solution proposed in the paper**:
   - A new Ensemble Multi-Quantiles (EMQ) method is proposed to predict the conditional distribution $P(y|X = x)$ by adaptively balancing distribution structure and flexibility.
   - The method starts from a Gaussian distribution and gradually adjusts the quantile predictions to better fit the distributional characteristics of real data.
   - An adaptive T-strategy determines the number of ensemble steps dynamically, and thereby the balance between distribution structure and flexibility.

### Main contributions

1. **Novel ensemble learning method**:
   - A concise and effective method is proposed that predicts the conditional distribution by adaptively balancing distribution structure (such as a Gaussian distribution) and flexibility.
2. **Quantile crossing is avoided by construction**:
   - Without additional effort (such as constrained optimization or post-processing), EMQ avoids the quantile-crossing problem in multi-quantile estimation.
3. **Superior experimental performance**:
   - Experiments on multiple datasets show that EMQ outperforms many existing uncertainty quantification methods in calibration and sharpness, including methods based on Gaussian assumptions, Bayesian methods, quantile regression, and traditional tree models.
4. **Adaptive flexibility**:
   - Experiments verify that EMQ performs flexible distribution prediction adaptively, especially when the adaptive T-strategy is used.
5. **Wide applicability**:
   - The method captures different types of distributional characteristics, including peakedness, asymmetry, long tails, and multimodality, which supports its necessity and advantages on complex real-world data.

In conclusion, the paper addresses the trade-off between flexibility and distribution structure in existing uncertainty quantification methods by proposing EMQ, which provides more accurate and reliable uncertainty estimates for regression tasks in machine learning.
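
The idea of starting from Gaussian quantiles and then reshaping them without letting them cross can be illustrated with a small sketch. This is not the EMQ algorithm: the gap-rescaling update in `reshape_quantiles`, the function names, and the step values are all assumptions made for illustration, since the summary above does not spell out the paper's update rule. The sketch only shows that if each adjustment keeps the gaps between adjacent quantiles positive, the quantiles stay ordered without any post-processing.

```python
from statistics import NormalDist

import numpy as np


def gaussian_quantiles(mu, sigma, levels):
    """Quantiles of N(mu, sigma^2) at the given probability levels."""
    z = np.array([NormalDist().inv_cdf(float(p)) for p in levels])
    return mu + sigma * z


def reshape_quantiles(q, log_gap_steps):
    """Hypothetical update (not the EMQ rule): rescale adjacent-quantile gaps.

    Each gap is multiplied by exp(step) > 0, so a sorted input stays sorted
    and the adjusted quantiles cannot cross.
    """
    gaps = np.diff(q)                        # positive for a sorted input
    new_gaps = gaps * np.exp(log_gap_steps)  # positive factors keep gaps positive
    return np.concatenate(([q[0]], q[0] + np.cumsum(new_gaps)))


def pinball_loss(y, q, levels):
    """Mean pinball (quantile) loss of quantiles q, shared by all samples y."""
    diff = y[:, None] - q[None, :]
    return float(np.mean(np.maximum(levels * diff, (levels - 1.0) * diff)))


levels = np.array([0.1, 0.25, 0.5, 0.75, 0.9])

# Step 0: a Gaussian starting point (standard normal here, as an assumption).
q_gauss = gaussian_quantiles(0.0, 1.0, levels)

# Illustrative steps that stretch only the upper gaps, giving a right-skewed shape.
steps = np.array([0.0, 0.0, 0.5, 1.0])
q_skewed = reshape_quantiles(q_gauss, steps)

print("Gaussian quantiles:", np.round(q_gauss, 3))
print("Reshaped quantiles:", np.round(q_skewed, 3))
print("Still sorted:", bool(np.all(np.diff(q_skewed) > 0)))
```

In a real boosting procedure the steps would be fitted to data, for example by reducing a loss such as `pinball_loss` across all levels; here they are fixed by hand only to show the departure from the Gaussian shape.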

Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification

Distributed Bootstrap Simultaneous Inference for High-Dimensional Quantile Regression

Ensemble Deep Learning-Based Non-Crossing Quantile Regression for Nonparametric Probabilistic Forecasting of Wind Power Generation

Regression via Arbitrary Quantile Modeling

Uncertainty Voting Ensemble for Imbalanced Deep Regression

Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series

Towards reliable uncertainty quantification via deep ensemble in multi-output regression task

A Distribution-Free Method for Probabilistic Prediction

Heat Equation Stein Variational Ensemble: Rethinking and Advancing Uncertainty-Aware Soft Sensor Modeling

Quantile Extreme Gradient Boosting for Uncertainty Quantification

Easy Uncertainty Quantification (EasyUQ): Generating Predictive Distributions from Single-Valued Model Output

Towards Reliable Uncertainty Quantification via Deep Ensembles in Multi-output Regression Task

Logit-Based Ensemble Distribution Distillation for Robust Autoregressive Sequence Uncertainties

Deviation Entropy-Based Dynamic Multi-Model Ensemble Interval Prediction Method for Quantifying Uncertainty of Building Cooling Load

Robust and scalable uncertainty estimation with conformal prediction for machine-learned interatomic potentials

Online Prediction of Extreme Conditional Quantiles via B-Spline Interpolation

Uncertainty in Gradient Boosting via Ensembles

Remaining Useful Life Prediction with Uncertainty Quantification Based on Multi-Distribution Fusion Structure

Conformalized-DeepONet: A Distribution-Free Framework for Uncertainty Quantification in Deep Operator Networks

Forecasting of Landslide Displacement Using a Probability-Scheme Combination Ensemble Prediction Technique