Exploring Transferable and Robust Adversarial Perturbation Generation Across Network Hierarchy

Ruikui Wang, Yuanfang Guo, Ruijie Yang, Yunhong Wang
DOI: https://doi.org/10.1016/j.neucom.2024.128620
IF: 6
2024-01-01
Neurocomputing
Abstract: The transferability and robustness of adversarial examples are two practical and important properties for black-box adversarial attacks. In this paper, we explore effective mechanisms to boost both of them across the network hierarchy. In general, a typical network can be hierarchically divided into an output stage, an intermediate stage and an input stage. Due to the over-specialization of the substitute model, we can hardly improve the transferability and robustness of the adversarial perturbations in the output stage. Therefore, we focus on manipulating the intermediate and input stages, and propose a Transferable and Robust Adversarial Perturbation generation (TRAP) method. Specifically, we propose a dynamically guided mechanism that continuously calculates accurate directional guidance for perturbation generation in the intermediate stage. In the input stage, instead of employing the single-form transformation augmentations adopted in existing methods, we leverage multi-form affine transformation augmentations to enrich the input diversity and simultaneously boost the robustness and transferability of the adversarial perturbations. Extensive evaluations on the ImageNet validation set demonstrate that TRAP achieves superior transferability when attacking convolutional neural networks (CNNs) and vision transformers (ViTs) compared to closely related state-of-the-art methods. For instance, based on the ResNet-101 model, we achieve an average attack success rate of 97.5% on black-box CNN models and 70.1% on ViT models, respectively. Moreover, TRAP exhibits robust performance against various physical-world interferences, such as Gaussian blurring, Gaussian noise, JPEG compression, color distortions, image erosion and image dilation. Additionally, we show the potential application of TRAP for proactive defense against deepfakes.
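To make the input-stage idea concrete, the following is a minimal sketch of what "multi-form" affine augmentation could look like: each augmented view mixes rotation, scaling, shear and translation rather than applying a single transformation type. The abstract does not specify the transformation set, parameter ranges or sampling scheme, so the function names (`random_affine`, `warp`, `augmented_batch`) and all numeric ranges here are assumptions for illustration only, not the authors' implementation.

```python
import math
import random


def random_affine(rng, max_rot_deg=15.0, scale_range=(0.9, 1.1),
                  max_shear=0.1, max_shift=2.0):
    """Sample a 2x3 affine matrix combining several transformation forms.

    All ranges are illustrative assumptions, not values from the paper.
    """
    theta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    s = rng.uniform(*scale_range)
    sh = rng.uniform(-max_shear, max_shear)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Linear part = scale * (rotation @ x-shear).
    return [
        [s * cos_t, s * (cos_t * sh - sin_t), tx],
        [s * sin_t, s * (sin_t * sh + cos_t), ty],
    ]


def warp(image, m):
    """Backward-warp a 2D image (list of rows) about its centre.

    Uses nearest-neighbour sampling; pixels mapped from outside the
    source image are filled with 0.0.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = m[0][0] * dx + m[0][1] * dy + m[0][2] + cx
            sy = m[1][0] * dx + m[1][1] * dy + m[1][2] + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def augmented_batch(image, n, seed=0):
    """Return n differently transformed views of the same image."""
    rng = random.Random(seed)
    return [warp(image, random_affine(rng)) for _ in range(n)]
```

In input-diversity attacks generally, the perturbation gradient is computed on several such views and aggregated, so that the perturbation does not overfit to one exact pixel alignment. Whether TRAP aggregates in this way is not stated in the abstract.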