Abstract: Transferability of adversarial examples is critical for black-box attacks on deep learning models. While most existing studies focus on enhancing the transferability of untargeted adversarial attacks, few have studied how to generate transferable targeted adversarial examples, which mislead models into predicting a specific class. Moreover, existing transferable targeted attacks usually fail to sufficiently characterize the target class distribution and therefore suffer from limited transferability. In this paper, we propose the Transferable Targeted Adversarial Attack (TTAA), which captures the distribution information of the target class from both label-wise and feature-wise perspectives to generate highly transferable targeted adversarial examples. To this end, we design a generative adversarial training framework consisting of a generator, which produces targeted adversarial examples, and feature-label dual discriminators, which distinguish the generated adversarial examples from target class images. Specifically, the label discriminator guides the adversarial examples to learn label-related distribution information about the target class. Meanwhile, the feature discriminator extracts feature-wise information with strong cross-model consistency, enabling the adversarial examples to learn transferable distribution information. Furthermore, we introduce random perturbation dropping, which further enhances transferability by augmenting the diversity of the adversarial examples used during training. Experiments demonstrate that our method achieves excellent transferability for targeted adversarial examples. The targeted fooling rate reaches 95.13% when transferring from VGG-19 to DenseNet-121, significantly outperforming state-of-the-art methods.
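
The abstract names random perturbation dropping but does not specify how it is carried out. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the perturbation is dropped element-wise with a fixed probability, that images lie in [0, 1], and that the perturbation is bounded in the L-infinity norm. The function name `random_perturbation_drop` and the parameters `drop_prob` and `eps` are hypothetical, and the values used are illustrative only.

```python
import numpy as np

def random_perturbation_drop(x, x_adv, drop_prob=0.1, eps=16 / 255, rng=None):
    """Randomly zero out part of the perturbation (x_adv - x).

    Assumption: dropping is element-wise; the paper may drop at another
    granularity (for example, spatial regions).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Bound the perturbation in the L-infinity norm (eps is an assumed budget).
    delta = np.clip(x_adv - x, -eps, eps)
    # keep is True where the perturbation survives, False where it is dropped.
    keep = rng.random(delta.shape) >= drop_prob
    # Re-apply the masked perturbation and stay inside the valid image range.
    return np.clip(x + delta * keep, 0.0, 1.0)

# Usage on synthetic data: a stand-in "image" and a stand-in generator output.
rng = np.random.default_rng(0)
x = rng.random((3, 32, 32))
x_adv = np.clip(x + rng.uniform(-16 / 255, 16 / 255, x.shape), 0.0, 1.0)
x_aug = random_perturbation_drop(x, x_adv, drop_prob=0.3, rng=rng)
```

Under this reading, each training step would see a differently masked version of the same adversarial example, which is one way the diversity mentioned in the abstract could arise.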
