Abstract: Although great progress has been made on adversarial attacks against deep neural networks (DNNs), their transferability remains unsatisfactory, especially for targeted attacks. Two underlying problems have long been overlooked: 1) the conventional setting runs $T$ iterations with a step size of $\epsilon/T$ to comply with the $\epsilon$-constraint, so most pixels receive only very small noise, much less than $\epsilon$; and 2) noise is usually manipulated pixel-wise, even though the features a DNN extracts for a pixel are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. To tackle these issues, we propose a patch-wise iterative method (PIM) for crafting adversarial examples with high transferability. Specifically, we introduce an amplification factor to the step size in each iteration, and the part of a pixel's accumulated gradient that overflows the $\epsilon$-constraint is assigned to its surrounding regions by a project kernel. However, targeted attacks aim to push adversarial examples into the territory of a specific class, and the amplification factor may lead to underfitting. We therefore introduce a temperature and propose a patch-wise++ iterative method (PIM++) to further improve transferability without significantly sacrificing white-box attack performance. Our method can be integrated into any gradient-based attack method. Compared with current state-of-the-art attack methods, we improve the success rate by 35.9\% for defense models and 32.7\% for normally trained models on average.
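The amplified step and the redistribution of overflowing noise described in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `patch_wise_attack`, `project_kernel`, `conv2d_same`, and `grad_fn`, the uniform zero-centre kernel, the default values of `beta` and `gamma`, and the toy linear score used in place of a DNN loss are all hypothetical choices made for this example. The temperature used by PIM++ is not modelled here.

```python
import numpy as np


def project_kernel(size=3):
    """Assumed project kernel: uniform weights with a zero centre, so
    overflowing noise is spread to the neighbours of a pixel only."""
    k = np.ones((size, size), dtype=np.float64) / (size * size - 1)
    k[size // 2, size // 2] = 0.0
    return k


def conv2d_same(x, k):
    """Zero-padded 'same' 2-D cross-correlation for a single-channel image."""
    r = k.shape[0] // 2
    padded = np.pad(x, r)  # constant zero padding on both axes
    out = np.zeros_like(x)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def patch_wise_attack(x, grad_fn, eps=16 / 255, steps=10, beta=10.0,
                      gamma=None, kernel_size=3):
    """Untargeted patch-wise iterative sketch on a 2-D image in [0, 1].

    grad_fn(x_adv) must return the gradient of the loss to be increased.
    beta is the amplification factor; gamma scales the redistributed noise.
    """
    alpha = eps / steps                  # conventional step size eps / T
    if gamma is None:
        gamma = beta * alpha             # assumption: same scale as the step
    kernel = project_kernel(kernel_size)
    x_adv = x.astype(np.float64).copy()
    acc = np.zeros_like(x_adv)           # accumulated amplified noise

    for _ in range(steps):
        step = beta * alpha * np.sign(grad_fn(x_adv))
        acc += step
        # Part of the accumulated noise that exceeds the eps budget.
        cut = np.clip(np.abs(acc) - eps, 0.0, None) * np.sign(acc)
        # Hand the overflow to surrounding pixels via the project kernel.
        spread = gamma * np.sign(conv2d_same(cut, kernel))
        acc += spread
        # Apply the update, then project back into the eps-ball and [0, 1].
        x_adv = np.clip(x_adv + step + spread, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv


# Toy usage: the "loss" is a linear score sum(weights * x), whose gradient
# with respect to x is simply `weights`.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8))
weights = rng.normal(size=(8, 8))
adv = patch_wise_attack(image, lambda z: weights)
print(float(np.abs(adv - image).max()))
```

With `beta = 1` and `gamma = 0` the loop reduces to the conventional setting with step size $\epsilon/T$, which makes the effect of the amplification factor easy to compare on the same input.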

Diffusion Patch Attack with Spatial-Temporal Cross-Evolution for Video Recognition

Efficient Decision-based Black-box Patch Attacks on Video Recognition

Universal 3-Dimensional Perturbations for Black-Box Attacks on Video Recognition Systems

Imperceptible Adversarial Attack with Multi-granular Spatio-temporal Attention for Video Action Recognition

Query-Efficient Decision-based Black-Box Patch Attack

Sparse Adversarial Video Attacks Via Superpixel-Based Jacobian Computation

Patch-wise++ Perturbation for Adversarial Targeted Attacks

Boosting the Transferability of Video Adversarial Examples Via Temporal Translation

Attacking Video Recognition Models with Bullet-Screen Comments

Black-box Adversarial Attacks on Video Recognition Models

Transferable Black-Box Attack against Face Recognition with Spatial Mutable Adversarial Patch

DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks

Channel-augmented Joint Transformation for Transferable Adversarial Attacks

Temporal-Distributed Backdoor Attack Against Video Based Action Recognition

Heuristic Black-box Adversarial Attacks on Video Recognition Models

Cube-Evo: A Query-Efficient Black-Box Attack on Video Classification System

Natural Adversarial Patch Generation Method Based on Latent Diffusion Model

Diffusion Models for Imperceptible and Transferable Adversarial Attack

Reinforcement Learning Based Sparse Black-box Adversarial Attack on Video Recognition Models