Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

Peter Lorenz, Mario Fernandez, Jens Müller, Ullrich Köthe
2024-06-29
Abstract: Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even the benchmarking has been standardized, i.e. OpenOOD. The number of post-hoc detectors is growing fast; they offer a way to protect a pre-trained classifier against natural distribution shifts and claim to be ready for real-world scenarios. However, their efficacy in handling adversarial examples has been neglected in the majority of studies. This paper investigates the adversarial robustness of 16 post-hoc detectors against several evasion attacks and discusses a roadmap towards adversarial defense in OOD detectors.
Cryptography and Security, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the following key problems:

1. **Adversarial Robustness of OOD Detectors**:
   - The paper points out that existing out-of-distribution (OOD) detectors perform well on natural distribution shifts but poorly on adversarial examples. Adversarial examples are inputs with small, carefully designed perturbations that mislead deep-learning models.
   - The authors evaluated 16 post-hoc OOD detectors under several common adversarial attacks (such as FGSM, PGD, and DeepFool) and found that these detectors lack robustness against adversarial examples.
2. **Redefining Adversarial Robustness**:
   - The paper attempts to redefine the adversarial robustness of OOD detectors, especially for post-hoc OOD detection methods. The authors propose a unified robustness definition under which a detector must not only identify natural distribution shifts but also resist adversarial attacks.
3. **Constructing an Adversarial Defense Roadmap**:
   - To improve the adversarial robustness of OOD detectors, the paper proposes a multi-level roadmap from detection to defense:
     - **Level 1**: Use strong attacks for evaluation and avoid hyperparameters that weaken the attack.
     - **Level 2**: Use more complex datasets and models, such as ImageNet-1K, to simulate real-world scenarios.
     - **Level 3**: Develop strategies to counter attacks, keeping in mind that new defense mechanisms may be broken quickly.
     - **Level 4**: Test methods under more sophisticated attacks (such as adaptive attacks) and treat adversarial robustness as an iterative process.
4. **Improvement of the Standardized Benchmark Framework**:
   - Current benchmark frameworks (such as OpenOOD) focus mainly on natural distribution shifts and ignore adversarial examples. The paper suggests extending these frameworks to include adversarial attacks so that OOD detector performance is measured more comprehensively.

### Summary

The core objective of this paper is to evaluate how existing OOD detectors perform on adversarial examples and to propose improvements that enhance their robustness and reliability in real-world applications. By redefining adversarial robustness and providing a multi-level defense roadmap, the authors hope to provide a benchmark for future research and to advance OOD detection.
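
To make the evaluation setting from point 1 concrete, the following is a minimal sketch of an FGSM evasion attack against a classifier guarded by a post-hoc detector. Everything here is an illustrative assumption rather than the paper's setup: the classifier is a hand-written two-class linear softmax model, the detector is the maximum softmax probability (MSP) score with an arbitrary threshold, and the names `msp_score`, `fgsm`, `THRESHOLD`, and the value of `eps` are hypothetical. The sketch only uses the Python standard library.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_fn(W, b, x):
    # Linear classifier: z = W x + b.
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def predict(W, b, x):
    z = logits_fn(W, b, x)
    return max(range(len(z)), key=z.__getitem__)

def msp_score(W, b, x):
    # Post-hoc detector score: maximum softmax probability (higher = more "in-distribution").
    return max(softmax(logits_fn(W, b, x)))

def fgsm(W, b, x, y, eps):
    # One-step FGSM: x_adv = x + eps * sign(dL/dx) for the cross-entropy loss L.
    p = softmax(logits_fn(W, b, x))
    dz = [pi - (1.0 if i == y else 0.0) for i, pi in enumerate(p)]  # dL/dz = p - onehot(y)
    grad = [sum(W[i][j] * dz[i] for i in range(len(W))) for j in range(len(x))]  # dL/dx = W^T dz
    sign = lambda g: (g > 0) - (g < 0)
    return [xj + eps * sign(g) for xj, g in zip(x, grad)]

# Toy setup (assumed values, chosen only for illustration).
W = [[2.0, 0.0], [0.0, 2.0]]
b = [0.0, 0.0]
x, y = [1.0, 0.2], 0          # clean input with true label 0
THRESHOLD = 0.8               # inputs scoring below this would be flagged as OOD

x_adv = fgsm(W, b, x, y, eps=0.8)

for name, inp in [("clean", x), ("adversarial", x_adv)]:
    score = msp_score(W, b, inp)
    print(name, "pred:", predict(W, b, inp),
          "MSP: %.3f" % score,
          "flagged as OOD:", score < THRESHOLD)
```

In this toy setup the prediction flips from class 0 to class 1 while the MSP score stays at roughly 0.83, so the threshold detector accepts both inputs. It illustrates the failure mode described above: the detector score can remain high on an input that already fools the classifier. The deliberately large `eps` is specific to this two-dimensional example and says nothing about perturbation budgets on image data.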