Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

Peter Lorenz, Dominik Strassel, Margret Keuper, Janis Keuper

2024-02-20

Abstract: Recently, RobustBench (Croce et al. 2020) has become a widely recognized benchmark for the adversarial robustness of image classification networks. In its most commonly reported sub-task, RobustBench evaluates and ranks the adversarial robustness of trained neural networks on CIFAR10 under AutoAttack (Croce and Hein 2020b) with l-inf perturbations limited to eps = 8/255. With the currently best-performing models scoring around 60% of the baseline, it is fair to characterize this benchmark as quite challenging. Despite its general acceptance in recent literature, we aim to foster discussion about the suitability of RobustBench as a key indicator of robustness that could be generalized to practical applications. Our line of argumentation against this is two-fold and supported by extensive experiments presented in this paper. We argue that I) the alteration of data by AutoAttack with l-inf, eps = 8/255 is unrealistically strong, resulting in close to perfect detection rates of adversarial samples even by simple detection algorithms and human observers. We also show that other attack methods are much harder to detect while achieving similar success rates. II) Results on low-resolution data sets like CIFAR10 do not generalize well to higher-resolution images, as gradient-based attacks appear to become even more detectable with increasing resolution.

Computer Vision and Pattern Recognition, Cryptography and Security

What problem does this paper attempt to address?

This paper questions whether RobustBench, together with the AutoAttack framework it uses by default, is a suitable benchmark for evaluating the adversarial robustness of image classification models. The authors present two main arguments:

1. **Excessive perturbation strength of AutoAttack**: The authors argue that AutoAttack perturbs the data unrealistically strongly under the \( l_{\infty} \) norm with \(\epsilon = 8/255\), so that even simple detection algorithms, and human observers, detect the adversarial samples almost perfectly. Attacks that are this easy to detect would rarely succeed in practical applications.

2. **Results on low-resolution datasets do not generalize well to high-resolution images**: The authors point out that results on low-resolution datasets such as CIFAR10 cannot be transferred directly to higher-resolution images, because gradient-based attacks appear to become even easier to detect as the image resolution increases.

The authors support both arguments with experiments that measure how well adversarial samples generated by AutoAttack are detected across different datasets and detection methods. These results indicate that AutoAttack's perturbations become easier to detect as resolution increases, and that other attack methods are much harder to detect than AutoAttack while achieving similar success rates.

In summary, the paper aims to open a discussion about whether RobustBench and AutoAttack are appropriate adversarial robustness benchmarks, and argues that more realistic and comprehensive evaluation methods are needed to measure the robustness of models in practical applications.
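To make the threat model concrete, the following is a minimal sketch of what an l-inf budget of eps = 8/255 permits. It is an illustration only, not the authors' code and not the AutoAttack implementation. The helper names `project_linf` and `max_l2_norm` are hypothetical, pixels are assumed to be scaled to [0, 1], and the 224x224 input size is an assumed example of a higher-resolution image that does not come from the paper.

```python
import math

EPS = 8 / 255  # l-inf budget named in the abstract, on a [0, 1] pixel scale


def project_linf(clean, perturbed, eps=EPS):
    """Clip `perturbed` so each value stays within eps of `clean` and inside [0, 1]."""
    out = []
    for x, x_adv in zip(clean, perturbed):
        lo = max(0.0, x - eps)  # lowest value the budget allows for this pixel
        hi = min(1.0, x + eps)  # highest value the budget allows for this pixel
        out.append(min(max(x_adv, lo), hi))
    return out


def max_l2_norm(num_values, eps=EPS):
    """Largest l2 norm a perturbation reaches when every entry sits at +/- eps."""
    return eps * math.sqrt(num_values)


# CIFAR10 images are 32x32 RGB. The 224x224 size is an assumed example of a
# higher-resolution input, not a setting taken from the paper.
cifar_dims = 3 * 32 * 32
hires_dims = 3 * 224 * 224

print(f"eps on the 0-255 scale: {EPS * 255:.0f} intensity levels")
print(f"max l2 norm, 32x32:   {max_l2_norm(cifar_dims):.2f}")
print(f"max l2 norm, 224x224: {max_l2_norm(hires_dims):.2f}")
```

With the per-pixel budget held fixed, the largest possible l2 norm of a perturbation grows with the square root of the number of pixel values, so it is seven times larger at 224x224 than at 32x32. This only shows that a fixed l-inf budget is not resolution-neutral; it does not reproduce the detection experiments reported in the paper.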
