Revisiting Energy-Based Model for Out-of-Distribution Detection

Yifan Wu, Xichen Ye, Songmin Dai, Dengye Pan, Xiaoqiang Li, Weizhong Zhang, Yifan Chen
2024-12-04
Abstract: Out-of-distribution (OOD) detection is an essential approach to robustifying deep learning models, enabling them to identify inputs that fall outside of their trained distribution. Existing OOD detection methods usually depend on crafted data, such as specific outlier datasets or elaborate data augmentations. While this is reasonable, the frequent mismatch between crafted data and OOD data limits model robustness and generalizability. In response to this issue, we introduce Outlier Exposure by Simple Transformations (OEST), a framework that enhances OOD detection by leveraging "peripheral-distribution" (PD) data. Specifically, PD data are samples generated through simple data transformations, thus providing an efficient alternative to manually curated outliers. We adopt energy-based models (EBMs) to study PD data. We recognize the "energy barrier" in OOD detection, which characterizes the energy difference between in-distribution (ID) and OOD samples and eases detection. PD data are introduced to establish the energy barrier during training. Furthermore, this energy barrier concept motivates a theoretically grounded energy-barrier loss to replace the classical energy-bounded loss, leading to an improved paradigm, OEST*, which achieves a more effective and theoretically sound separation between ID and OOD samples. We perform empirical validation of our proposal, and extensive experiments across various benchmarks demonstrate that OEST* achieves better or similar accuracy compared with state-of-the-art methods.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper asks **how to improve the ability of deep learning models to detect out-of-distribution (OOD) samples in open-world scenarios**. Existing OOD detection methods usually rely on crafted external data, such as specific outlier datasets or elaborate data augmentations, but the mismatch between such data and actual OOD data limits the robustness and generalization ability of the model.

To address this, the authors propose a new framework, **Outlier Exposure by Simple Transformations (OEST)**, built around "peripheral-distribution" (PD) data. PD data are samples generated by applying simple transformations to the training data. They are neither fully in-distribution (ID) samples nor typical OOD samples, but sit between the two.

The authors also revisit energy-based models (EBMs) and strengthen OOD detection through the concept of an "energy barrier". Specifically, they propose a theoretically grounded energy-barrier loss to replace the classical energy-bounded loss, so as to better separate ID and OOD samples.

### Main contributions:

1. **Introducing peripheral-distribution (PD) data**: PD data generated by simple transformations are used to enhance OOD detection.
2. **Re-examining energy-based models (EBMs)**: Proposing the concept of energy polarization, that is, encouraging ID samples to have lower energy values and OOD samples to have higher energy values.
3. **Establishing an energy barrier**: An energy barrier is established between ID and PD data during training, so that ID and OOD samples can be distinguished effectively.
4. **Proposing a targeted optimization strategy**: Building on the energy barrier, existing classifiers are optimized, with significant performance improvements on both near-OOD and far-OOD detection tasks.
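As a rough illustration of the ingredients described above, the sketch below computes an energy score from classifier logits, builds PD samples with a simple transformation, and measures how far a batch is from having an energy gap between ID and PD data. Several choices are assumptions rather than details taken from the paper: the energy is taken to be the commonly used free energy, \(E(x; f) = -T \log \sum_k e^{f_k(x)/T}\); the 90-degree rotation in `make_pd_batch` is only one example of a simple transformation; and `barrier_hinge` is a generic hinge penalty standing in for the paper's energy-barrier loss, whose exact form is not given here. All function names are hypothetical.

```python
import numpy as np


def energy(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Assumed free energy E(x; f) = -T * logsumexp(f(x) / T), one value per row."""
    z = logits / temperature
    m = z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    lse = m[:, 0] + np.log(np.exp(z - m).sum(axis=1))
    return -temperature * lse


def make_pd_batch(images: np.ndarray, k: int = 1) -> np.ndarray:
    """Hypothetical PD transform: rotate each (N, H, W, C) image by k * 90 degrees."""
    return np.rot90(images, k=k, axes=(1, 2))


def barrier_hinge(e_id: np.ndarray, e_pd: np.ndarray, margin: float) -> float:
    """Illustrative hinge penalty (not the paper's loss): it is zero once every
    PD energy exceeds its paired ID energy by at least `margin`."""
    return float(np.maximum(0.0, margin - (e_pd - e_id)).mean())
```

In a training loop, one would pass a batch and its `make_pd_batch` counterpart through the classifier, compute `energy` on both sets of logits, and add a term like `barrier_hinge` to the classification loss. Lower energy for ID inputs and higher energy for PD inputs is the direction this penalty pushes in.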
### Formula summary:

- **Energy barrier hypothesis** (Formula 9):
  \[ E(x^+; f) - E(x; f) > B\|x' - x^+\| + \gamma_\alpha \]
  where \(x\) is an ID sample, \(x^+\) is a PD sample, \(x'\) is an OOD sample, \(\gamma_\alpha \geq 0\) is a constant, \(B\) is the radius of the domain, and \(\alpha \in (0, 1)\) is the probability level.
- **Energy barrier theorem** (Formula 11):
  \[ E(x'; f) - E(x; f) > \gamma_\alpha \]
  That is, with probability \(1-\alpha\), the energy of the OOD sample \(x'\) exceeds that of a random ID sample \(x\) by more than \(\gamma_\alpha\).

With these components, OEST* improves OOD detection accuracy and, across various benchmarks, performs better than or on par with state-of-the-art methods.
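The probabilistic reading of Formula 11 can be checked empirically once energies are available. The sketch below, with hypothetical function names and made-up energy values, estimates how often the inequality fails for a given OOD sample over a set of ID energies (an empirical counterpart of \(\alpha\)), and applies the usual rule of flagging high-energy inputs as OOD. Choosing the threshold as a quantile of the ID energies is an assumption, not something the summary specifies.

```python
import numpy as np


def barrier_violation_rate(e_id: np.ndarray, e_ood: float, gamma: float) -> float:
    """Share of ID energies E(x) for which E(x') - E(x) > gamma does NOT hold."""
    return float(np.mean((e_ood - e_id) <= gamma))


def threshold_from_id(e_id: np.ndarray, keep_rate: float = 0.95) -> float:
    """Energy value at the `keep_rate` quantile of the ID energies (assumed rule)."""
    return float(np.quantile(e_id, keep_rate))


def detect_ood(energies: np.ndarray, threshold: float) -> np.ndarray:
    """Flag inputs whose energy is above the threshold as OOD."""
    return energies > threshold


# Made-up energies, purely for illustration.
e_id = np.array([-5.0, -4.0, -3.0, -1.0])
alpha_hat = barrier_violation_rate(e_id, e_ood=0.0, gamma=1.5)
flags = detect_ood(np.array([-4.5, 0.0]), threshold_from_id(e_id))
```

A small `alpha_hat` for a reasonably large `gamma` would indicate that the OOD sample is separated from most ID samples by an energy gap, which is the situation the theorem describes.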