Push & Pull: Transferable Adversarial Examples With Attentive Attack

Lianli Gao,Zijie Huang,Jingkuan Song,Yang Yang,Heng Tao Shen
DOI: https://doi.org/10.1109/tmm.2021.3079723
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract:Targeted attack aims to mislead the classification model to a specific class, and it can be further divided into black-box and white-box targeted attack depending on whether the classification model is known. A growing number of approaches rely on disrupting the image representations to craft adversarial examples. However, this type of methods often suffer from either low white-box targeted attack success rate or poor black-box targeted attack transferability. To address these problems, we propose a Transferable Attentive Attack (TAA) method which adds perturbation to clean images based on the attended regions and features. This is motivated by one important observation that deep-learning based classification models (or even shallow-learning based models like SIFT) make the prediction mainly based on the informative and discriminative regions of an image. Specifically, the corresponding features of the informative regions are firstly extracted, and the anchor image's features are iteratively "pushed" away from the source class and simultaneously "pulled" closer to the target class along with attacking. Moreover, we introduce a new strategy that the attack selects the centroids of source and target class cluster as the input of triplet loss to achieve high transferability. Experimental results demonstrate that our method improves the transferability of adversarial example, while maintaining higher success rate for white-box targeted attacks compared with the state-of-the-arts. In particular, TAA attacks on image-representation based task like VQA also result in a significant performance drop in terms of accuracy.
computer science, information systems,telecommunications, software engineering
What problem does this paper attempt to address?