Attention, Please! Adversarial Defense via Activation Rectification and Preservation

Shangxi Wu, Jitao Sang, Kaiyuan Xu, Jiaming Zhang, Jian Yu
DOI: https://doi.org/10.1145/3572843
2023-02-27
Abstract: This study offers a new understanding of the adversarial attack problem by examining the correlation between adversarial attacks and changes in visual attention. In particular, we observe that (1) images with incomplete attention regions are more vulnerable to adversarial attacks, and (2) successful adversarial attacks lead to deviated and scattered activation maps. Based on these observations, we use a masking method to design an attention-preserving loss and a contrast-based method to design a loss that rectifies the model’s attention. Accordingly, we design an attention-based adversarial defense framework in which these constraints can be used either to improve adversarial training or to mount stronger adversarial attacks. We hope that the attention-related data analysis and defense solution in this study will shed some light on the mechanism behind adversarial attacks and facilitate the design of future adversarial defense and attack models.
computer science, information systems, theory & methods, software engineering
What problem does this paper attempt to address?