DeepUQ: Assessing the Aleatoric Uncertainties from two Deep Learning Methods

Rebecca Nevin, Aleksandra Ćiprijanović, Brian D. Nord
2024-11-13
Abstract: Assessing the quality of aleatoric uncertainty estimates from uncertainty quantification (UQ) deep learning methods is important in scientific contexts, where uncertainty is physically meaningful and important to characterize and interpret exactly. We systematically compare aleatoric uncertainty measured by two UQ techniques, Deep Ensembles (DE) and Deep Evidential Regression (DER). Our method focuses on both zero-dimensional (0D) and two-dimensional (2D) data, to explore how the UQ methods function for different data dimensionalities. We investigate uncertainty injected on the input and output variables and include a method to propagate uncertainty in the case of input uncertainty so that we can compare the predicted aleatoric uncertainty to the known values. We experiment with three levels of noise. The aleatoric uncertainty predicted across all models and experiments scales with the injected noise level. However, the predicted uncertainty is miscalibrated to $\rm{std}(\sigma_{\rm al})$ with the true uncertainty for half of the DE experiments and almost all of the DER experiments. The predicted uncertainty is the least accurate for both UQ methods for the 2D input uncertainty experiment and the high-noise level. While these results do not apply to more complex data, they highlight that further research on post-facto calibration for these methods would be beneficial, particularly for high-noise and high-dimensional settings.
Machine Learning, Artificial Intelligence
### What problem does this paper attempt to solve?

This paper evaluates how well two deep-learning uncertainty quantification (UQ) methods, **Deep Ensembles (DE)** and **Deep Evidential Regression (DER)**, estimate aleatoric uncertainty. By systematically comparing the two methods across data dimensionalities (0D and 2D), the authors explore how effectively each handles uncertainty in the input and output variables, and they test whether the predicted aleatoric uncertainty is consistent with the known true uncertainty.

#### Main research questions include:

1. **Evaluating the accuracy of aleatoric uncertainty**: Investigate whether the aleatoric uncertainty predicted by the two UQ methods scales with the injected noise level across the three noise levels.
2. **Calibration problem**: Check whether the predicted aleatoric uncertainty is calibrated, i.e., within one standard deviation, $\rm{std}(\sigma_{\rm al})$, of the true uncertainty value, especially for high-noise and high-dimensional data.
3. **The influence of different data dimensions**: Explore how the performance of these methods differs between data of different dimensions (zero-dimensional tabular data and two-dimensional image data).
4. **Input uncertainty propagation**: Study how to propagate the uncertainty of the input variables to the output variables, and evaluate whether the predicted aleatoric uncertainty is consistent with the true uncertainty calculated through error propagation.

#### Research background:

- **Aleatoric uncertainty** is the uncertainty inherent in the data itself, caused by the randomness of the data. It is distinct from epistemic uncertainty.
- In scientific fields, especially astronomy and physics, accurate assessment of aleatoric uncertainty is crucial because it helps researchers understand physical processes and carry out precise experimental design and data analysis.

#### Method overview:

- **Experimental design**: The authors use four experimental setups (0D and 2D data, each with uncertainty injected on either the input or the output variables) and inject three levels of noise (low, medium, high) into each.
- **Evaluation criteria**: Three evaluation criteria (desiderata) are set:
  1. The predicted uncertainty should increase as the injected uncertainty increases.
  2. The aleatoric uncertainty should be calibrated, i.e., within one standard deviation, $\rm{std}(\sigma_{\rm al})$, of the true uncertainty value.
  3. These criteria should hold for all data dimensions and types of uncertainty injection (input and output).

#### Conclusions:

- **Results**: Both methods meet the first evaluation criterion (the predicted uncertainty increases as the injected uncertainty increases), but both fall short on calibration: the predicted uncertainty is miscalibrated for half of the DE experiments and almost all of the DER experiments. It is least accurate for the 2D input uncertainty experiment and at the high-noise level.
- **Future research directions**: The authors suggest further research on post-facto calibration of these methods, especially for high-noise and high-dimensional data, to improve the accuracy of their predictions.

Through this study, the authors aim to support more reliable aleatoric uncertainty estimation for scientific and industrial applications, so that the predictions of deep-learning models better meet practical needs.
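The input-uncertainty propagation and the calibration criterion described above can be illustrated with a small sketch. This is not the authors' code. It assumes a hypothetical linear model `y = slope * x + intercept`, for which first-order error propagation gives an output uncertainty of `|slope| * sigma_x`, and it reads "calibrated" as "the mean predicted uncertainty lies within one standard deviation of the predicted uncertainties from the true value", which is one possible interpretation of the criterion. The function names and all numeric values are illustrative assumptions.

```python
import random
import statistics


def propagate_input_noise(slope, sigma_x):
    """First-order error propagation for the assumed model y = slope * x + intercept.

    Noise of size sigma_x on x becomes noise of size |slope| * sigma_x on y.
    """
    return abs(slope) * sigma_x


def is_calibrated(predicted_sigmas, true_sigma):
    """Assumed criterion: the mean predicted sigma is within one standard
    deviation (of the predicted sigmas) of the true sigma."""
    mean_sigma = statistics.mean(predicted_sigmas)
    spread = statistics.stdev(predicted_sigmas)
    return abs(mean_sigma - true_sigma) <= spread


# Hypothetical setup: slope, intercept, and input noise level are made up.
slope, intercept, sigma_x = 2.0, 1.0, 0.5
true_sigma_y = propagate_input_noise(slope, sigma_x)

# Monte Carlo sanity check of the propagation rule: perturb a fixed input
# with Gaussian noise and measure the scatter of the resulting outputs.
rng = random.Random(0)
x0 = 3.0
outputs = [slope * (x0 + rng.gauss(0.0, sigma_x)) + intercept for _ in range(20000)]
empirical_sigma_y = statistics.stdev(outputs)

# Made-up "predicted" aleatoric uncertainties standing in for model output.
well_calibrated = [0.9, 1.0, 1.1, 1.05, 0.95]
overconfident = [0.45, 0.5, 0.55, 0.5, 0.5]

print(true_sigma_y, round(empirical_sigma_y, 2))
print(is_calibrated(well_calibrated, true_sigma_y))
print(is_calibrated(overconfident, true_sigma_y))
```

In this toy setup the first list of predictions passes the check and the second, which underestimates the propagated uncertainty by roughly half, fails it. For a nonlinear model or 2D image inputs the propagation step would need a different treatment than the single-slope rule assumed here.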