Abstract: Deep learning models have been shown to be susceptible to universal adversarial perturbations (UAPs), which has raised wide concern in the community. Compared with conventional adversarial attacks that generate adversarial samples at the instance level, a UAP can fool the target model on different instances with only a single perturbation, which allows the robustness of the model to be evaluated more effectively and accurately. Existing universal attack methods fail to exploit the differences and connections between the instance and universal levels to produce dominant perturbations. To address this challenge, we propose a new universal attack method that unifies instance-specific and universal attacks from a feature perspective to generate a more dominant UAP. Specifically, we reformulate the UAP generation task as a minimax optimization problem and then use an instance-specific attack method to solve the minimization problem, thereby obtaining better training data for generating the UAP. We also introduce a consistency regularizer to explore the relationships among the training data, further improving the dominance of the generated UAP. Furthermore, our method is generic, makes no additional assumptions about the training data, and hence can be applied in both data-dependent (supervised) and data-independent (unsupervised) settings. Extensive experiments demonstrate that the proposed method outperforms existing methods by a significant margin in both data-dependent and data-independent settings. Code is available at https://github.com/lisenxd/AT-UAP.
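To make the contrast between instance-level and universal perturbations concrete, the following is a minimal sketch on a toy linear softmax classifier. It is not the proposed method: it omits the minimax reformulation and the consistency regularizer, and uses a plain sign-gradient baseline instead. The model, the random data, and all names (`input_grad`, `fgsm`, `universal_perturbation`, `eps`, `lr`, `steps`) are assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def input_grad(W, b, X, y):
    """Gradient of the per-sample cross-entropy loss w.r.t. the inputs,
    for a linear model with logits = X @ W + b."""
    p = softmax(X @ W + b)
    p[np.arange(len(y)), y] -= 1.0        # dL/dlogits = p - onehot(y)
    return p @ W.T                        # shape (n_samples, n_features)

def fgsm(W, b, X, y, eps):
    """Instance level: a separate perturbation for every sample."""
    return eps * np.sign(input_grad(W, b, X, y))

def universal_perturbation(W, b, X, y, eps, steps=50, lr=0.05):
    """Universal level: one perturbation shared by all samples, found by
    sign-gradient ascent on the mean loss with projection onto the
    L-infinity ball of radius eps."""
    delta = np.zeros(X.shape[1])
    for _ in range(steps):
        g = input_grad(W, b, X + delta, y).mean(axis=0)
        delta = np.clip(delta + lr * np.sign(g), -eps, eps)
    return delta

def accuracy(W, b, X, y):
    return float(((X @ W + b).argmax(axis=1) == y).mean())

# Toy setup (hypothetical): random linear classifier, labels taken from its
# own clean predictions so that clean accuracy is 1.0 by construction.
rng = np.random.default_rng(0)
n, d, k, eps = 200, 20, 5, 0.5
W, b = rng.normal(size=(d, k)), np.zeros(k)
X = rng.normal(size=(n, d))
y = (X @ W + b).argmax(axis=1)

delta = universal_perturbation(W, b, X, y, eps)
print("clean accuracy:     ", accuracy(W, b, X, y))
print("instance-level FGSM:", accuracy(W, b, X + fgsm(W, b, X, y, eps), y))
print("single universal:   ", accuracy(W, b, X + delta, y))
```

The instance-level attack returns an array with the same shape as `X`, whereas the universal attack returns a single vector of length `d` that is added to every input. That difference in shape is the distinction the abstract draws between the two levels.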
