Enhancing Out-of-Distribution Detection with Multitesting-based Layer-wise Feature Fusion

Jiawei Li, Sitong Li, Shanshan Wang, Yicheng Zeng, Falong Tan, Chuanlong Xie
2024-03-16
Abstract: Deploying machine learning in open environments presents the challenge of encountering diverse test inputs that differ significantly from the training data. These out-of-distribution samples may exhibit shifts in local or global features compared to the training distribution. The machine learning (ML) community has responded with a number of methods aimed at distinguishing anomalous inputs from the original training data. However, most previous studies have focused on the output layer or penultimate layer of pre-trained deep neural networks. In this paper, we propose a novel framework, Multitesting-based Layer-wise Out-of-Distribution (OOD) Detection (MLOD), to identify distributional shifts in test samples at different levels of features through a rigorous multiple testing procedure. Our approach distinguishes itself from existing methods in that it requires neither modifying the structure of the pre-trained classifier nor fine-tuning it. Through extensive experiments, we demonstrate that our proposed framework can seamlessly integrate with any existing distance-based inspection method while efficiently utilizing feature extractors of varying depths. Our scheme effectively enhances the performance of out-of-distribution detection when compared to baseline methods. In particular, MLOD-Fisher achieves superior performance in general. When trained using KNN on CIFAR10, MLOD-Fisher significantly lowers the false positive rate (FPR) from 24.09% to 7.47% on average compared to merely utilizing the features of the last layer.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily addresses a significant issue encountered in machine learning (especially deep learning): how to effectively detect out-of-distribution (OOD) data samples. When deploying machine learning models in open environments, the test data may differ significantly from the training data, and these differences can lead to a substantial decline in model prediction performance or even complete failure. Therefore, accurately identifying which input data belongs to distributions unseen by the model is crucial for ensuring the model's safety and reliability.

The paper proposes a new framework called "Multitesting-based Layer-wise Out-of-Distribution Detection" (MLOD). The main innovation of this method lies in its ability to improve the accuracy of OOD detection through comprehensive analysis of multi-layer features, rather than relying solely on the last or penultimate layer of a pre-trained neural network. Specifically, MLOD utilizes multiple hypothesis testing techniques from statistics to integrate feature representations from different layers and effectively control the False Positive Rate (FPR), thereby enhancing overall detection performance.

The key contributions of the paper can be summarized as follows:

1. **Proposed a novel OOD detection framework**: This framework identifies distribution differences between test samples and training data by leveraging multi-layer features of deep neural networks and using multiple hypothesis testing techniques.
2. **Comprehensively evaluated the effectiveness of MLOD**: Through theoretical analysis and experimental validation, the advantages of MLOD under various combination testing methods were demonstrated.
3. **Significantly improved the performance of existing OOD detection methods**: Compared to methods that rely solely on the last layer features, MLOD showed better performance on multiple benchmark datasets, especially in reducing the false positive rate.

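
To make the layer-wise testing idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer yields a scalar anomaly score (for example, a KNN distance to training features, where larger means more anomalous), that each score is turned into an empirical p-value against in-distribution calibration scores from the same layer, and that the p-values are merged with Fisher's combination test. All function names and the threshold `alpha` are hypothetical. Fisher's test assumes independent p-values, which layer-wise features need not satisfy, so the combined value should be read as approximate.

```python
import math


def empirical_p_value(score, id_scores):
    """Empirical p-value of `score` against in-distribution calibration scores.

    Assumes a larger score means "more anomalous". The +1 correction keeps
    the p-value strictly positive, so its logarithm is always defined.
    """
    exceed = sum(1 for s in id_scores if s >= score)
    return (exceed + 1) / (len(id_scores) + 1)


def chi2_sf_even(x, k):
    """Survival function of a chi-square variable with 2k degrees of freedom.

    For even degrees of freedom there is a closed form:
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p_i) ~ chi-square(2k) under the null."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even(stat, len(p_values))


def is_ood(layer_scores, calibration_scores, alpha=0.05):
    """Flag a sample as OOD when the combined p-value falls below `alpha`.

    `layer_scores[i]` is the test sample's score at layer i, and
    `calibration_scores[i]` holds in-distribution scores for that layer.
    """
    p_values = [
        empirical_p_value(score, cal)
        for score, cal in zip(layer_scores, calibration_scores)
    ]
    return fisher_combine(p_values) < alpha


if __name__ == "__main__":
    # Toy calibration data: three layers, 99 in-distribution scores each.
    calibration = [list(range(1, 100)) for _ in range(3)]
    print(is_ood([1000, 1000, 1000], calibration))  # extreme at every layer
    print(is_ood([0, 0, 0], calibration))           # typical at every layer
```

In this toy setup the first sample is more extreme than every calibration score at all three layers and is flagged, while the second is not. A sample that is only mildly unusual at each layer can still be flagged once the evidence is pooled, which is the intuition behind using several layers rather than the last one alone.
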
In summary, the paper proposes an effective solution to the critical issue of out-of-distribution data detection and demonstrates its effectiveness through empirical studies.