A Universal Targeted Attack Method against Image Classification

Huili Luo, Zhaoquan Gu, Chunajing Zhang, Le Wang, Shuhao Li, Zhiqin Chen
DOI: https://doi.org/10.1109/DSC53577.2021.00032
2021-01-01
Abstract: Rapid advances in deep neural networks (DNNs) have improved the prospects of image classification technologies such as facial recognition. The emergence of adversarial examples, however, has revealed the vulnerability of DNNs. Many attack methods have been proposed against DNNs, and they fall into two types: targeted and untargeted attacks. A targeted attack causes the classifier to assign the input to a specified target class, whereas an untargeted attack only aims to make the classifier (a DNN classifier, for instance) misclassify the input by adding an adversarial perturbation. Most adversarial attack methods compute a different perturbation for each image, which is costly and inefficient. In this paper, we explore a universal targeted attack method against image classification, in which a single universal perturbation is crafted and added to all images to generate adversarial examples. The classifier assigns these adversarial examples to the specified target class with high probability. Black-box and white-box attack experiments against the ResNet model on the CIFAR-10 and CIFAR-100 datasets verify the effectiveness of our method.
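The idea of one perturbation shared by all images can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify: it assumes a toy linear softmax classifier in place of ResNet, random vectors in place of CIFAR images, and plain projected gradient descent under an L-infinity bound. The function names and the values of `eps`, `lr`, and `steps` are all hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    # Toy linear classifier standing in for a DNN (assumption).
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def target_loss_and_grad(W, b, xs, delta, target):
    """Mean cross-entropy toward `target` and its gradient w.r.t. delta."""
    dim = len(delta)
    grad = [0.0] * dim
    loss = 0.0
    for x in xs:
        p = softmax(logits(W, b, [xi + di for xi, di in zip(x, delta)]))
        loss -= math.log(max(p[target], 1e-300))
        for k, row in enumerate(W):
            coeff = p[k] - (1.0 if k == target else 0.0)
            for i in range(dim):
                grad[i] += coeff * row[i]
    n = len(xs)
    return loss / n, [g / n for g in grad]

def universal_targeted_perturbation(W, b, xs, target, eps=0.5, lr=0.01, steps=200):
    """Fit ONE delta for all inputs by projected gradient descent.

    Each coordinate of delta is kept in [-eps, eps]. Clipping x + delta to a
    valid pixel range is omitted to keep the sketch short.
    """
    delta = [0.0] * len(xs[0])
    for _ in range(steps):
        _, grad = target_loss_and_grad(W, b, xs, delta, target)
        delta = [min(eps, max(-eps, d - lr * g)) for d, g in zip(delta, grad)]
    return delta

def target_hit_rate(W, b, xs, delta, target):
    """Fraction of perturbed inputs classified as `target`."""
    hits = 0
    for x in xs:
        z = logits(W, b, [xi + di for xi, di in zip(x, delta)])
        hits += z.index(max(z)) == target
    return hits / len(xs)

# Synthetic stand-ins for a trained model and an image set (assumptions).
rng = random.Random(0)
NUM_CLASSES, DIM, TARGET = 5, 20, 3
W = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
b = [0.0] * NUM_CLASSES
xs = [[rng.random() for _ in range(DIM)] for _ in range(100)]

zero = [0.0] * DIM
delta = universal_targeted_perturbation(W, b, xs, TARGET)
loss_before, _ = target_loss_and_grad(W, b, xs, zero, TARGET)
loss_after, _ = target_loss_and_grad(W, b, xs, delta, TARGET)
print("target hit rate before:", target_hit_rate(W, b, xs, zero, TARGET))
print("target hit rate after: ", target_hit_rate(W, b, xs, delta, TARGET))
```

The gradient is averaged over the whole input set before each update, which is what makes the resulting `delta` universal rather than per-image. A white-box attack on a real network would obtain the same gradient by backpropagation instead of the closed form used here.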