Robust outlier detection by de-biasing VAE likelihoods

Kushal Chauhan, Barath Mohan U, Pradeep Shenoy, Manish Gupta, Devarajan Sridharan
DOI: https://doi.org/10.48550/arXiv.2108.08760
2022-07-19
Abstract: Deep networks often make confident, yet, incorrect, predictions when tested with outlier data that is far removed from their training distributions. Likelihoods computed by deep generative models (DGMs) are a candidate metric for outlier detection with unlabeled data. Yet, previous studies have shown that DGM likelihoods are unreliable and can be easily biased by simple transformations to input data. Here, we examine outlier detection with variational autoencoders (VAEs), among the simplest of DGMs. We propose novel analytical and algorithmic approaches to ameliorate key biases with VAE likelihoods. Our bias corrections are sample-specific, computationally inexpensive, and readily computed for various decoder visible distributions. Next, we show that a well-known image pre-processing technique -- contrast stretching -- extends the effectiveness of bias correction to further improve outlier detection. Our approach achieves state-of-the-art accuracies with nine grayscale and natural image datasets, and demonstrates significant advantages -- both with speed and performance -- over four recent, competing approaches. In summary, lightweight remedies suffice to achieve robust outlier detection with VAEs.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses **the unreliability of deep generative model likelihoods, specifically those of variational autoencoders (VAEs), as a score for outlier detection**. When outliers are detected from unlabeled data, the likelihoods computed by a VAE are biased by low-level features of the input, such as pixel intensity and contrast, which degrades detection performance. The paper proposes de-biasing methods that improve the robustness and accuracy of VAE-based outlier detection.

### Problem Background

1. **Overconfident Predictions of Deep Neural Networks**:
   - Deep neural networks tend to make confident but incorrect predictions on images that differ significantly from the training distribution.
   - This is a key issue in practical applications, especially in computer vision.
2. **Unreliability of VAE Likelihoods**:
   - Although the VAE is a deep generative model, the likelihoods it computes are unreliable for outlier detection because they are easily biased by low-level features of the input, such as pixel intensity and contrast.
   - For example, a VAE assigns higher likelihoods to samples with many black pixels and lower likelihoods to samples dominated by intermediate gray values.

### Main Contributions of the Paper

1. **Analyze and Correct the Bias in VAE Likelihoods**:
   - Through theoretical analysis and algorithm design, the paper proposes two de-biasing methods: an analytical correction and an algorithmic correction.
   - The analytical correction targets the continuous Bernoulli visible distribution and removes the bias by adjusting the reconstruction error term.
   - The algorithmic correction applies to other visible distributions, such as the categorical distribution, and derives the correction from statistics of pixel values computed on the training data.
2. **Introduce a Contrast-Stretching Pre-processing Step**:
   - Contrast stretching is a standard image pre-processing technique. It reduces the effect of contrast variations on VAE likelihoods and thereby further improves outlier detection.
3. **Extensive Experimental Verification**:
   - The paper reports experiments on nine grayscale and natural image datasets that demonstrate the effectiveness of the de-biasing methods.
   - The de-biased VAE likelihoods achieve state-of-the-art outlier detection accuracy and show advantages in both speed and performance over four recent competing approaches.

### Conclusion

With these lightweight corrections, the paper mitigates the bias that low-level image features introduce into VAE likelihoods, making VAEs more robust and efficient for outlier detection.
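
To make the summary above more concrete, the following is a minimal Python sketch of its two lightweight ingredients: percentile-based contrast stretching, and the intensity-dependent ceiling of the continuous Bernoulli likelihood that makes dark or bright pixels score higher than mid-gray ones. All function names are hypothetical, the percentile cut-offs are illustrative defaults, and the grid search is a simple stand-in for whatever closed-form expression the paper uses. Subtracting the per-pixel ceiling is shown only as one plausible reading of "adjusting the reconstruction error term", not as the authors' exact correction.

```python
import numpy as np


def contrast_stretch(image, low_pct=5.0, high_pct=95.0):
    """Linearly rescale intensities so two percentiles map to 0 and 1.

    The percentile cut-offs are illustrative assumptions.
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = np.percentile(image, [low_pct, high_pct])
    if hi <= lo:
        # Constant image: there is no contrast to stretch.
        return np.zeros_like(image)
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)


def cb_log_norm(lam):
    """Log normalizing constant of the continuous Bernoulli, log C(lam)."""
    lam = np.asarray(lam, dtype=np.float64)
    near_half = np.abs(lam - 0.5) < 1e-4
    # Substitute a harmless value where the closed form would be 0/0.
    safe = np.where(near_half, 0.25, lam)
    u = 1.0 - 2.0 * safe
    log_c = np.log(2.0 * np.arctanh(u) / u)
    # C(lam) tends to 2 as lam approaches 0.5.
    return np.where(near_half, np.log(2.0), log_c)


def cb_log_pdf(x, lam):
    """Per-pixel log density log p(x | lam) for x in [0, 1], lam in (0, 1)."""
    x = np.asarray(x, dtype=np.float64)
    lam = np.asarray(lam, dtype=np.float64)
    return cb_log_norm(lam) + x * np.log(lam) + (1.0 - x) * np.log1p(-lam)


def max_cb_log_likelihood(x, grid_size=999):
    """Best log density that any lam on a grid can assign to each pixel value."""
    x = np.asarray(x, dtype=np.float64)
    grid = np.linspace(1e-3, 1.0 - 1e-3, grid_size)
    table = cb_log_pdf(x[..., None], grid)  # shape: x.shape + (grid_size,)
    return table.max(axis=-1)


def debiased_cb_log_likelihood(x, lam):
    """Summed log-likelihood minus the per-pixel ceiling (assumed correction)."""
    return float(np.sum(cb_log_pdf(x, lam) - max_cb_log_likelihood(x)))
```

Calling `max_cb_log_likelihood(np.array([0.0, 0.5, 1.0]))` gives a ceiling close to zero for the mid-gray pixel and a clearly higher one for the black and white pixels. This matches the bias described above: even a perfect reconstruction earns a higher raw likelihood on images with extreme intensities. Subtracting the ceiling puts every pixel on a common scale in which a perfect reconstruction scores roughly zero regardless of intensity.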