Boosting the Transferability of Adversarial Examples Via Adaptive Attention and Gradient Purification Methods

Liwen Wu, Lei Zhao, Bin Pu, Yi Zhao, Xin Jin, Shaowen Yao
DOI: https://doi.org/10.1109/ijcnn60899.2024.10651160
2024-01-01
Abstract: Deep neural networks have been shown to be vulnerable to adversarial examples. Recently, various methods have been proposed to improve the transferability of adversarial examples. However, most existing methods add perturbations to the whole image indiscriminately, which drastically degrades the visual quality of the adversarial examples. In addition, existing attack methods ignore the gradient information of secondary features, which reduces the accuracy of the generated adversarial perturbations. In this work, we propose the Adaptive Attention and Gradient Purification Attack (AAGP) to address these issues. Specifically, we use the mean and standard deviation of the gradient values to locate the regions the model attends to. Since different models share similar regions of attention, adding perturbations only to these regions reduces the amount of perturbation added and also improves the transferability of adversarial examples to other models. In addition, we disrupt the correlation between pixels in the regions where secondary features lie by randomly discarding pixels in low-attention areas, which yields more accurate gradient information and therefore more transferable perturbations. Experimental results on ImageNet show that our method improves both the visual quality of the adversarial examples and their transferability compared with several advanced baselines.
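The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the exact rule, so the threshold `mean + k * std` on the gradient magnitude, the function names, the parameters `k`, `drop_prob`, and `eps`, and the random stand-in gradient are all assumptions made for illustration.

```python
import numpy as np

def attention_mask(grad, k=0.5):
    """Assumed rule: a pixel is 'high attention' when its gradient
    magnitude exceeds mean + k * std of all gradient magnitudes."""
    mag = np.abs(grad)
    return mag > mag.mean() + k * mag.std()

def drop_low_attention(image, mask, drop_prob, rng):
    """Randomly zero out pixels outside the mask (low-attention areas).
    High-attention pixels are never dropped."""
    drop = (~mask) & (rng.random(image.shape) < drop_prob)
    return np.where(drop, 0.0, image)

def masked_sign_step(image, grad, mask, eps):
    """One sign-gradient step applied only to high-attention pixels,
    clipped back to the valid [0, 1] pixel range."""
    return np.clip(image + eps * np.sign(grad) * mask, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8))        # toy grayscale image in [0, 1)
grad = rng.normal(size=(8, 8))    # stand-in for a real loss gradient
eps = 8 / 255                     # hypothetical perturbation budget

mask = attention_mask(grad, k=0.5)
dropped = drop_low_attention(image, mask, drop_prob=0.3, rng=rng)
adv = masked_sign_step(image, grad, mask, eps)
```

In an actual attack, `dropped` would be passed through the surrogate model to recompute the gradient before the update step; here the gradient is random, so the sketch only demonstrates the masking and dropping mechanics.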