Universal Perturbation Generation for Black-box Attack Using Evolutionary Algorithms

Siyu Wang, Yucheng Shi, Yahong Han
DOI: https://doi.org/10.1109/icpr.2018.8546023
2018-01-01
Abstract: Image classifiers based on deep neural networks (DNNs) are vulnerable to tiny, imperceptible perturbations. Maliciously generated adversarial examples can exploit the instability of DNNs and mislead them into outputting a wrong classification result. Prior works showed the transferability of adversarial perturbations between models and between images. In this work, we shed light on the combination of source/target misclassification, black-box attack, and universal perturbation by employing improved evolutionary algorithms. We additionally find that adversarial initialization enhances the efficiency of evolutionary algorithms in finding universal perturbations. Experiments demonstrate impressive misclassification rates and surprising transferability for the proposed attack method using different models trained on the CIFAR-10 and CIFAR-100 datasets. Our attack method also shows robustness against defensive measures such as adversarial training.
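To make the idea concrete, the following is a minimal sketch of how an evolutionary search for a single universal perturbation might look in a black-box setting, where only the classifier's predicted labels are queried. It is not the authors' algorithm: the function name `evolve_universal_perturbation`, the toy linear classifier `toy_predict`, the L-infinity budget `eps`, and the selection, crossover, and mutation choices are all assumptions made for illustration. It also uses random initialization rather than the adversarial initialization mentioned in the abstract.

```python
import numpy as np

def evolve_universal_perturbation(predict, images, labels, eps=0.1,
                                  pop_size=20, generations=30,
                                  sigma=0.02, seed=0):
    """Illustrative (hypothetical) evolutionary search for one perturbation
    that changes the black-box predictions on as many images as possible."""
    rng = np.random.default_rng(seed)
    shape = images.shape[1:]
    # Random initial population inside the L-infinity ball of radius eps.
    pop = rng.uniform(-eps, eps, size=(pop_size,) + shape)

    def fitness(delta):
        # Fooling rate: fraction of images whose predicted label changes.
        adv = np.clip(images + delta, 0.0, 1.0)
        return np.mean(predict(adv) != labels)

    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]  # keep the better half unchanged
        n_children = pop_size - len(elite)
        # Uniform crossover between two randomly chosen elite parents.
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Gaussian mutation, then projection back into the eps-ball.
        children = children + rng.normal(0.0, sigma, size=children.shape)
        children = np.clip(children, -eps, eps)
        pop = np.concatenate([elite, children])

    scores = np.array([fitness(d) for d in pop])
    return pop[np.argmax(scores)], float(scores.max())

# Toy black-box model (an assumption): a fixed random linear classifier.
_rng = np.random.default_rng(1)
_w = _rng.normal(size=64)

def toy_predict(x):
    return (x.reshape(len(x), -1) @ _w > 0).astype(int)

images = _rng.random((50, 8, 8))   # stand-in for real image data
labels = toy_predict(images)       # clean predictions serve as reference labels
delta, rate = evolve_universal_perturbation(toy_predict, images, labels)
print("fooling rate on toy data:", rate)
```

Because the search only calls `predict`, it needs no gradients or model internals, which is what makes the setting black-box. The same `delta` is added to every image, which is what makes the perturbation universal. A targeted variant could replace the fitness with the fraction of images classified as a chosen target label.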