Abstract: Deep neural networks (DNNs) have been shown to be vulnerable to a universal adversarial perturbation (UAP), a single quasi-imperceptible perturbation that deceives a DNN on most input images. Current UAP methods can be divided into data-dependent and data-independent methods. The former transfer poorly to black-box models because they rely too heavily on model-specific features. The latter show inferior attack performance on white-box models because they fail to exploit the model's response to benign images. To address these issues, this paper proposes a novel universal adversarial attack that generates a UAP with strong transferability by disrupting model-agnostic features (e.g., edges or simple textures), which are invariant across models. Specifically, we first devise an objective function that weakens the significant channel-wise features and strengthens the less significant ones, where the two groups are partitioned by a designed strategy. Because this objective function does not depend on labeled samples, out-of-distribution (OOD) data can be used to train the UAP. To enhance attack performance with limited training samples, we update the UAP iteratively with the average gradient over the mini-batch input, which encourages the UAP to capture the local information inside the mini-batch. In addition, we introduce a momentum term that accumulates gradient information at each iterative step, so that the UAP perceives global information over the training set. Finally, extensive experimental results demonstrate that the proposed method outperforms existing UAP approaches. We also exhaustively investigate the transferability of the UAP across models, datasets, and tasks.
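
The update scheme described in the abstract (a label-free feature objective, the average gradient over a mini-batch, and momentum accumulation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a toy linear feature extractor `W` in place of a DNN layer, a hypothetical partition rule (a channel counts as "significant" when its activation exceeds the per-sample mean), and illustrative values for `eps`, `step`, and `mu`. The names `channel_signs`, `feature_loss`, and `train_uap` are made up for this example.

```python
import numpy as np

def channel_signs(feats):
    """Label each channel +1 (significant) or -1 (less significant).

    Assumption: a channel is significant when its activation on the benign
    input exceeds that sample's mean activation. The abstract does not
    specify the actual partition strategy.
    """
    thresh = feats.mean(axis=1, keepdims=True)
    return np.where(feats > thresh, 1.0, -1.0)

def feature_loss(W, x, delta):
    """Label-free objective: significant-channel activations minus
    less-significant ones, measured on the perturbed input (lower is better)."""
    signs = channel_signs(x @ W.T)            # partition from benign features
    return float((signs * ((x + delta) @ W.T)).sum(axis=1).mean())

def train_uap(W, batches, eps=0.1, step=0.01, mu=0.9):
    """W: (C, D) toy linear feature extractor; batches: iterable of (B, D) arrays."""
    delta = np.zeros(W.shape[1])
    momentum = np.zeros_like(delta)
    for x in batches:
        signs = channel_signs(x @ W.T)        # (B, C)
        # For a linear extractor, the per-sample gradient of the loss with
        # respect to delta is signs[i] @ W; average it over the mini-batch.
        grad = (signs @ W).mean(axis=0)
        # Momentum accumulates L1-normalized gradients across iterations.
        momentum = mu * momentum + grad / (np.abs(grad).sum() + 1e-12)
        # Signed descent step, then projection onto the L-infinity ball.
        delta = np.clip(delta - step * np.sign(momentum), -eps, eps)
    return delta

# Usage on random stand-in data (no labels are needed).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
data = rng.uniform(0.0, 1.0, size=(64, 16))
batches = [data[i:i + 16] for i in range(0, 64, 16)] * 5
delta = train_uap(W, batches)
print(feature_loss(W, data, np.zeros(16)), feature_loss(W, data, delta))
```

With a real network, the analytic gradient above would be replaced by automatic differentiation through the chosen feature layer; the averaging, momentum, and projection steps would stay the same.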

Improving the Transferability of Adversarial Examples with Separable Positive and Negative Disturbances

An Adversarial Attack Via Feature Contributive Regions

Transferable Adversarial Examples Based on Global Smooth Perturbations

Understanding and Enhancing the Transferability of Adversarial Examples

Boosting the Targeted Transferability of Adversarial Examples via Salient Region & Weighted Feature Drop

Improving Transferability of Adversarial Examples With Input Diversity

Enhancing Transferability of Adversarial Examples with Spatial Momentum

Adaptive momentum variance for attention-guided sparse adversarial attacks

Improving Query Efficiency of Black-box Adversarial Attack

Bag of Tricks to Boost Adversarial Transferability

Improving Adversarial Transferability with Scheduled Step Size and Dual Example

Improving the Transferability of Adversarial Examples via Direction Tuning

Improving Adversarial Transferability via Intermediate-level Perturbation Decay

Improving Transferability of Universal Adversarial Perturbation with Feature Disruption

Toward Understanding and Boosting Adversarial Transferability from a Distribution Perspective

Improving Adversarial Transferability by Stable Diffusion

Enhancing the Transferability of Adversarial Examples with Noise Reduced Gradient

Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks

Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction

Evading Defenses to Transferable Adversarial Examples by Mitigating Attention Shift

Delving into Transferable Adversarial Examples and Black-box Attacks