Uncertainty Quantification in Deep Neural Networks through Statistical Inference on Latent Space

Luigi Sbailò, Luca M. Ghiringhelli
2023-05-18
Abstract: Uncertainty-quantification methods are applied to estimate the confidence of deep-neural-network classifiers over their predictions. However, most widely used methods are known to be overconfident. We address this problem by developing an algorithm that exploits the latent-space representation of data points fed into the network, to assess the accuracy of their prediction. Using the latent-space representation generated by the fraction of the training set that the network classifies correctly, we build a statistical model that is able to capture the likelihood of a given prediction. We show on a synthetic dataset that commonly used methods are mostly overconfident. Overconfidence also occurs for predictions made on data points that are outside the distribution that generated the training data. In contrast, our method can detect such out-of-distribution data points as inaccurately predicted, thus aiding in the automatic detection of outliers.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of accurately quantifying prediction uncertainty in deep neural network classifiers, with particular attention to out-of-distribution data points. Most existing uncertainty quantification methods are overconfident: they assign high confidence to predictions that are wrong, especially on out-of-distribution inputs. As a result, the model can fail to recognize that its predictions are inaccurate when it encounters unknown or anomalous data.

The authors propose a method that evaluates prediction uncertainty from the latent space representation. A statistical model is built from the latent representations of the training points that the network classifies correctly, and this model is used to estimate prediction confidence. The method can flag misclassified in-distribution data points as well as out-of-distribution data points, which improves the model's robustness on unknown data.

### Main Contributions of the Paper:

1. **A new uncertainty quantification method**: Latent space representations are used to evaluate prediction confidence, which identifies out-of-distribution data points more accurately.
2. **A remedy for the overconfidence of existing methods**: By constructing a statistical model, the method assesses prediction uncertainty more reliably on unknown data.
3. **Experimental validation of the method's effectiveness**: Experiments conducted on the MNIST dataset show that this method outperforms existing MC-dropout and ensemble methods in detecting out-of-distribution samples.

### Background and Motivation:

- **Importance of uncertainty quantification**: In high-risk applications such as medical image diagnosis and autonomous driving, accurately assessing model uncertainty is crucial. If the model cannot recognize that its predictions are inaccurate when it encounters unknown data, the consequences may be severe.
- **Limitations of existing methods**: Most existing uncertainty quantification methods (such as MC-dropout and ensemble methods) tend to be overconfident on out-of-distribution data, which limits their reliability in practical applications.

### Method Overview:

- **Latent space representation**: Input data is transformed into its latent space representation through forward propagation and grouped according to the network's predicted labels.
- **Statistical model construction**: For each class and each hidden layer, a multivariate Gaussian distribution is constructed using the latent representations of correctly classified data points.
- **Confidence evaluation**: The likelihood of the input data point in the latent space is calculated and compared with a preset threshold to evaluate prediction confidence.

### Experimental Results:

- **High true positive and true negative rates**: Experimental results show that this method is accurate both at accepting correctly predicted in-distribution data points and at rejecting out-of-distribution data points.
- **Superior to existing methods**: Compared to MC-dropout and ensemble methods, this method performs better in detecting out-of-distribution samples.

### Limitations and Future Work:

- **Gaussian assumption**: The method assumes that data in the latent space can be approximated by a normal distribution, which may not hold in some complex scenarios.
- **Independence assumption**: It assumes that the probability distributions of different hidden layers are independent, which may not be entirely true in deep networks.
- **Extensions**: Future work could apply this method to continuous value prediction tasks and explore more application scenarios.
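The three steps listed under Method Overview can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single hidden layer whose latent vectors are already extracted, and the function names (`fit_class_gaussians`, `fit_thresholds`, `is_trusted`), the ridge term `reg`, and the percentile-based threshold are hypothetical choices made only for illustration. Under the independence assumption mentioned above, per-layer log-likelihoods for several hidden layers would simply be summed.

```python
import numpy as np

def fit_class_gaussians(latents, predicted, labels, reg=1e-6):
    """Fit one multivariate Gaussian per class on correctly classified points."""
    correct = predicted == labels
    models = {}
    for c in np.unique(labels[correct]):
        z = latents[correct & (predicted == c)]
        mean = z.mean(axis=0)
        # Small ridge term (an assumption) keeps the covariance invertible.
        cov = np.cov(z, rowvar=False) + reg * np.eye(z.shape[1])
        models[int(c)] = (mean, cov)
    return models

def log_likelihood(z, mean, cov):
    """Gaussian log-density of one latent vector or of a batch of them."""
    d = z - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("...i,...i->...", d @ np.linalg.inv(cov), d)
    return -0.5 * (maha + logdet + mean.size * np.log(2.0 * np.pi))

def fit_thresholds(latents, predicted, labels, models, percentile=1.0):
    """Hypothetical threshold rule: a low percentile of training log-likelihoods."""
    correct = predicted == labels
    thresholds = {}
    for c, (mean, cov) in models.items():
        z = latents[correct & (predicted == c)]
        thresholds[c] = np.percentile(log_likelihood(z, mean, cov), percentile)
    return thresholds

def is_trusted(z, predicted_class, models, thresholds):
    """Accept a prediction only if its latent vector is likely under that class."""
    mean, cov = models[predicted_class]
    return bool(log_likelihood(z, mean, cov) >= thresholds[predicted_class])

# Synthetic stand-in for latent vectors: two classes in a 2-D latent space.
rng = np.random.default_rng(0)
latents = np.vstack([
    rng.normal([-3.0, 0.0], 1.0, size=(200, 2)),
    rng.normal([3.0, 0.0], 1.0, size=(200, 2)),
])
labels = np.repeat([0, 1], 200)
predicted = labels.copy()
predicted[:5] = 1  # a few misclassified points, excluded from the fit

models = fit_class_gaussians(latents, predicted, labels)
thresholds = fit_thresholds(latents, predicted, labels, models)

print(is_trusted(np.array([3.0, 0.0]), 1, models, thresholds))    # near class 1
print(is_trusted(np.array([50.0, 50.0]), 1, models, thresholds))  # far from all data
```

The threshold here is set so that roughly 1% of correctly classified training points would be rejected. The source only states that a preset threshold is used, so any other rule for choosing it would fit this sketch equally well.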
In summary, this paper proposes a new uncertainty quantification method that evaluates prediction confidence by analyzing latent space representations. It addresses the overconfidence of existing methods and, in the reported experiments, detects out-of-distribution samples better than MC-dropout and ensemble methods.