A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection

Eduardo Dadalto, Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida
2023-06-06
Abstract: A key feature of out-of-distribution (OOD) detection is to exploit a trained neural network by extracting statistical patterns and relationships through the multi-layer classifier to detect shifts in the expected input data distribution. Despite achieving solid results, several state-of-the-art methods rely on the penultimate or last layer outputs only, leaving behind valuable information for OOD detection. Methods that explore the multiple layers either require a special architecture or a supervised objective to do so. This work adopts an original approach based on a functional view of the network that exploits the sample's trajectories through the various layers and their statistical dependencies. It goes beyond multivariate features aggregation and introduces a baseline rooted in functional anomaly detection. In this new framework, OOD detection translates into detecting samples whose trajectories differ from the typical behavior characterized by the training set. We validate our method and empirically demonstrate its effectiveness in OOD detection compared to strong state-of-the-art baselines on computer vision benchmarks.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses **out-of-distribution (OOD) detection in multi-layer neural networks**. Most existing OOD detection methods rely on the output of the penultimate or last layer only, ignoring the valuable information carried by the other layers. Methods that do use multiple layers require either a special network architecture or a supervised objective, which may not be feasible in practice.

The paper proposes a new perspective based on functional data: the trajectory of a sample through the layers of the network is treated as a function (a curve), which captures the statistical dependencies between the sample's representations at each layer. This makes better use of multi-layer information and allows effective OOD detection without additional OOD data.

### Main contributions

1. **Calculate OOD scores from trajectories**:
   - A method is proposed for mapping semantic information from multiple embedding spaces to piecewise linear functions.
   - A simple inner product between the test sample's trajectory and the training prototype trajectory yields a score indicating how likely the sample is to be in-distribution.
2. **Extensive empirical evaluation**:
   - The method is evaluated on the CIFAR-10 and ImageNet datasets with five different neural network architectures. The results show that it outperforms 12 strong existing methods in terms of average TNR (True Negative Rate at 95% TPR) and AUROC (Area Under the Receiver Operating Characteristic Curve).

### Method overview

1. **Functional representation**:
   - Reduce the multivariate hidden representation of each layer to a scalar value and compute the class-conditional prototypes of the training population.
   - Map the features of each layer to scalar scores through a probability-weighted projection; the sequence of scores across layers forms the functional representation.
2. **Test-time OOD score calculation**:
   - At inference, rescale the trajectory of the test sample and compute its similarity score with the training reference trajectory.
   - Threshold the score to obtain the final binary decision function.

### Experimental results

- **ImageNet benchmark**:
  - On ResNet-50, compared with the ReAct method, TNR increases by 6.7% and AUROC by 1.5%.
  - On BiT-S-101, compared with the GradNorm method, TNR increases by 18.9% and AUROC by 5.4%.
  - On DenseNet-121, compared with the ReAct method, TNR increases by 16% and AUROC by 3.9%.
  - On MobileNet-V3 Large, TNR increases by about 20% and AUROC by 9.2%.
- **CIFAR-10 benchmark**:
  - With the ResNet-18 model, the average AUROC over multiple OOD datasets is 2.4% higher than that of existing methods.

### Conclusion

The functional-data approach to OOD detection proposed in this paper makes effective use of multi-layer information by capturing the trajectory of each sample through the layers of the network, and it improves OOD detection performance. The method performs well on multiple benchmark datasets and network architectures, and has broad application prospects.
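The two steps of the method overview above (functional representation, then test-time scoring) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not fix the projection, so cosine similarity to class prototypes is assumed here, as are the per-layer max rescaling, the normalization of the inner product by the squared norm of the reference trajectory, and the choice of the threshold as the 5th percentile of held-out in-distribution scores (95% TPR). The synthetic data and all function names (`class_prototypes`, `layer_scores`, `trajectories`, `fit_reference`, `ood_score`) are hypothetical.

```python
import numpy as np

def class_prototypes(feats, labels, num_classes):
    """Class-conditional mean of one layer's features: (num_classes, dim)."""
    return np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])

def layer_scores(feats, protos, probs):
    """Reduce one layer's features to one scalar per sample (assumed form):
    cosine similarity to each class prototype, weighted by class probabilities."""
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    p = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + 1e-12)
    return (probs * (f @ p.T)).sum(axis=1)

def trajectories(layer_feats, layer_protos, probs):
    """Stack the per-layer scalars into trajectories: (n_samples, n_layers)."""
    cols = [layer_scores(f, p, probs) for f, p in zip(layer_feats, layer_protos)]
    return np.stack(cols, axis=1)

def fit_reference(train_traj):
    """Per-layer rescaling factors and the mean rescaled training trajectory."""
    scale = np.abs(train_traj).max(axis=0) + 1e-12
    ref = (train_traj / scale).mean(axis=0)
    return scale, ref

def ood_score(traj, scale, ref):
    """Inner product with the reference trajectory; higher = more in-distribution."""
    return (traj / scale) @ ref / (ref @ ref)

# Synthetic stand-in for a 3-layer network with 2 classes (assumption).
rng = np.random.default_rng(0)
num_classes, dims = 2, [8, 16, 32]
class_means = [3.0 * rng.normal(size=(num_classes, d)) for d in dims]

def sample_id(n):
    """In-distribution samples: features near their class mean, confident probs."""
    labels = rng.integers(0, num_classes, size=n)
    feats = [m[labels] + rng.normal(size=(n, m.shape[1])) for m in class_means]
    probs = np.full((n, num_classes), 0.1 / (num_classes - 1))
    probs[np.arange(n), labels] = 0.9
    return feats, labels, probs

def sample_ood(n):
    """OOD samples: pure-noise features and uniform class probabilities."""
    feats = [rng.normal(size=(n, d)) for d in dims]
    probs = np.full((n, num_classes), 1.0 / num_classes)
    return feats, probs

# Fit prototypes and the reference trajectory on training data only.
train_feats, train_labels, train_probs = sample_id(500)
protos = [class_prototypes(f, train_labels, num_classes) for f in train_feats]
scale, ref = fit_reference(trajectories(train_feats, protos, train_probs))

# Score held-out in-distribution samples and OOD samples.
test_feats, _, test_probs = sample_id(200)
ood_feats, ood_probs = sample_ood(200)
id_scores = ood_score(trajectories(test_feats, protos, test_probs), scale, ref)
ood_scores = ood_score(trajectories(ood_feats, protos, ood_probs), scale, ref)

# Threshold keeping 95% of in-distribution samples, then TNR on OOD samples.
threshold = np.quantile(id_scores, 0.05)
tnr = float((ood_scores < threshold).mean())
print(f"threshold={threshold:.3f}  TNR@95%TPR={tnr:.3f}")
```

In a real setting, the per-layer features and class probabilities would come from forward hooks on a trained classifier rather than from the synthetic samplers used here. Note that no OOD data enters the fitting step: prototypes, rescaling factors, and the reference trajectory all come from in-distribution training samples.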