Attention-SA: Exploiting Model-approximated Data Semantics for Adversarial Attack

Qian Li, Qingyuan Hu, Haoran Fan, Chenhao Lin, Chao Shen, Libing Wu
DOI: https://doi.org/10.1109/tifs.2024.3409945
2024-01-01
Abstract: Adversarial defense of deep neural networks has gained significant attention, and there has been active research on exploiting model vulnerabilities for attacks, such as gradient-based attacks and pre-defined semantic manipulation. However, these attacks often lack a clear adversarial pattern connected to the notions the model has extracted, and they are restricted to fixed constraints, which makes them gradually ineffective against newly proposed robust defenses. In this paper, we propose to use the semantics learned by the model, which may not be the true semantics needed for a correct prediction, as a guiding clue in adversarial example construction. We propose a new attention-based, semantics-oriented adversarial attack without any prior constraint on semantic preservation, dubbed Attention-SA, from the perspective of learned task-related decision factors. Specifically, to capture the learned factors, we introduce a post-hoc soft attention with gradient-sensitivity activation consistency to probe the information in the latent representation that bridges the input and the prediction. With the attention guidance, we perturb the separated semantic units, then back-propagate the variation onto the input to discover expanded adversarial examples. Finally, extensive performance evaluations on the CIFAR-10 and ImageNet datasets demonstrate the superiority of our proposed method, and we verify its effectiveness against various robust defenses.
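The pipeline the abstract describes (probe the latent representation with a soft attention, perturb the attended units, then carry the change back to the input) can be illustrated with a toy sketch. This is not the authors' implementation: the two-layer model, the gradient-times-activation score standing in for "gradient-sensitivity activation consistency", and the names `encode`, `latent_attention`, and `attention_guided_attack` are all assumptions made for illustration.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def encode(W1, x):
    # Hypothetical toy encoder: latent z = tanh(W1 x)
    return np.tanh(W1 @ x)

def latent_attention(W2, z, label):
    # For a linear head (logits = W2 z), d logit[label] / d z is W2[label].
    grad = W2[label]
    # Assumed stand-in score: gradient x activation, turned into a soft attention.
    return softmax(grad * z), grad

def attention_guided_attack(W1, W2, x, label, eps=0.5, steps=200, lr=0.05):
    z = encode(W1, x)
    attn, grad = latent_attention(W2, z, label)
    # Shift each latent unit against the true-class gradient, weighted by attention.
    z_target = np.clip(z - eps * len(z) * attn * np.sign(grad), -0.99, 0.99)
    # Carry the latent change back to the input by gradient descent on
    # 0.5 * ||encode(x_adv) - z_target||^2.
    x_adv = x.copy()
    for _ in range(steps):
        z_adv = encode(W1, x_adv)
        g = W1.T @ ((z_adv - z_target) * (1.0 - z_adv ** 2))
        x_adv -= lr * g
    return x_adv, attn
```

In a realistic setting the hand-derived gradients would presumably be replaced by an autodiff framework and a trained network; the sketch only shows the order of the three steps, with no constraint on the size of the input change.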