Luis A. Ortega,Simón Rodríguez-Santana,Daniel Hernández-Lobato
Abstract:Recently, there has been an increasing interest in performing post-hoc uncertainty estimation about the predictions of pre-trained deep neural networks (DNNs). Given a pre-trained DNN via back-propagation, these methods enhance the original network by adding output confidence measures, such as error bars, without compromising its initial accuracy. In this context, we introduce a novel family of sparse variational Gaussian processes (GPs), where the posterior mean is fixed to any continuous function when using a universal kernel. Specifically, we fix the mean of this GP to the output of the pre-trained DNN, allowing our approach to effectively fit the GP's predictive variances to estimate the DNN prediction uncertainty. Our approach leverages variational inference (VI) for efficient stochastic optimization, with training costs that remain independent of the number of training points, scaling efficiently to large datasets such as ImageNet. The proposed method, called fixed mean GP (FMGP), is architecture-agnostic, relying solely on the pre-trained model's outputs to adjust the predictive variances. Experimental results demonstrate that FMGP improves both uncertainty estimation and computational efficiency when compared to state-of-the-art methods.
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is: how to provide reliable uncertainty estimates for pre - trained deep neural networks (DNNs) without affecting their initial prediction performance. Specifically, the authors propose a new method - Fixed - Mean Gaussian Processes (FMGP) - to add confidence measures (such as error bars) to the prediction results of DNNs without retraining or fine - tuning them, thereby improving the reliability of the model in risk - sensitive scenarios.
### Background of the Main Problem
In recent years, deep neural networks (DNNs) have achieved remarkable success in various pattern recognition tasks, but they have limitations in certain application scenarios for the following reasons:
1. **Poor calibration of probability prediction**: DNNs tend to produce over - confident predictions, especially in areas where the training data is insufficiently covered.
2. **Lack of reasoning ability**: DNNs perform poorly in scenarios where model uncertainty is required, such as in the fields of autonomous driving and medicine.
To address these issues, Bayesian neural networks (BNNs) and other posterior inference methods have been proposed, but these methods usually involve high - dimensional, multi - modal posterior parameter distributions and have high computational complexity, making them difficult to apply to large - scale practical problems.
### The Solution Proposed in the Paper
The authors introduce Fixed - Mean Gaussian Processes (FMGP), a new family of sparse variational Gaussian processes. The core idea of FMGP is to effectively fit the prediction variance and estimate the uncertainty of DNN predictions by fixing the posterior mean of the Gaussian process to the output of the pre - trained DNN using a general kernel function. The specific steps are as follows:
1. **Fixed mean**: By selecting appropriate inducing points and kernel function parameters, make the posterior mean of the Gaussian process match the output of the pre - trained DNN.
2. **Variational inference (VI)**: Use variational inference to optimize the prediction variance of the Gaussian process, ensuring that the computational cost is independent of the number of training samples and is suitable for large - scale datasets (such as ImageNet).
3. **Architecture - independence**: FMGP only depends on the output of the pre - trained model to adjust the prediction variance, so it has no requirements for the specific architecture of the DNN.
### Advantages of the Method
- **Scalability**: FMGP avoids calculating the Jacobian matrix of the DNN, reducing the computational complexity and is suitable for large - scale neural networks.
- **High - performance retention**: Since the posterior mean is fixed to the output of the pre - trained DNN, FMGP can maintain the original prediction performance.
- **Improved uncertainty estimation**: Compared with other posterior methods, FMGP provides more accurate uncertainty estimates.
In conclusion, this paper aims to provide an efficient and reliable uncertainty estimation method for pre - trained DNNs by introducing FMGP, thereby enhancing their reliability and performance in practical applications.