Abstract: Saliency detection is an effective front-end process for many security-related tasks, e.g., autonomous driving and tracking. Adversarial attacks serve as an efficient surrogate for evaluating the robustness of deep saliency models before they are deployed in the real world. However, most current adversarial attacks exploit gradients spanning the entire image space to craft adversarial examples, ignoring the fact that natural images are high-dimensional and spatially over-redundant, which leads to high attack cost and perturbations that are easy to perceive. To circumvent these issues, this paper builds an efficient bridge between accessible partially-white-box source models and unknown black-box target models. The proposed method consists of two steps: 1) We design a new partially-white-box attack, which defines the cost function in the compact hidden space to penalize a fraction of the feature activations corresponding to the salient regions, instead of penalizing every pixel across the entire dense output space. This partially-white-box attack reduces the redundancy of the adversarial perturbation. 2) We exploit the non-redundant perturbations from some source models as prior cues, and use an iterative zeroth-order optimizer to compute the directional derivatives along the non-redundant prior directions in order to estimate the actual gradient of the black-box target model. The non-redundant priors boost the update of the "critical" pixels located at the non-zero coordinates of the prior cues, while leaving the redundant pixels located at the zero coordinates unaffected. Our method achieves the best tradeoff between attack ability and perturbation redundancy. Finally, we conduct a comprehensive experiment to test the robustness of 18 state-of-the-art deep saliency models against 16 malicious attacks under both white-box and black-box settings, which contributes the first robustness benchmark to the saliency community.
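To make step 2 concrete, the following is a minimal NumPy sketch of prior-guided zeroth-order gradient estimation. It is not the authors' implementation: the function names (`estimate_gradient`, `prior_guided_attack`), the symmetric finite-difference scheme, the sign-based update, and all hyperparameters (`delta`, `step`, `eps`, `iters`) are assumptions chosen for illustration. The black-box target is represented only by a scalar loss `loss_fn` that can be queried, and each prior is a sparse array assumed to come from a source model.

```python
import numpy as np

def estimate_gradient(loss_fn, x, priors, delta=1e-3):
    """Approximate the gradient of a black-box loss using only queries.

    For each prior direction u, the directional derivative is estimated with
    a symmetric finite difference, and the estimates are accumulated along u.
    (Assumption: priors are treated as independent directions; the sum is an
    exact projection only if they are mutually orthogonal.)
    """
    grad = np.zeros_like(x, dtype=float)
    for p in priors:
        norm = np.linalg.norm(p)
        if norm == 0:
            continue  # an all-zero prior carries no direction
        u = p / norm
        # Two queries to the black-box loss per prior direction.
        d = (loss_fn(x + delta * u) - loss_fn(x - delta * u)) / (2 * delta)
        grad += d * u
    return grad

def prior_guided_attack(loss_fn, x, priors, step=0.01, eps=0.05, iters=20):
    """Iteratively increase loss_fn while staying in an L-infinity ball of radius eps."""
    x = x.astype(float)
    x_adv = x.copy()
    for _ in range(iters):
        g = estimate_gradient(loss_fn, x_adv, priors)
        # sign(0) == 0, so pixels at zero coordinates of every prior never move.
        x_adv = x_adv + step * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # perturbation budget
        x_adv = np.clip(x_adv, 0.0, 1.0)          # valid image range (assumed [0, 1])
    return x_adv
```

Because the estimated gradient is a combination of the prior directions, it is zero wherever all priors are zero, which mirrors the abstract's point that only the "critical" pixels are updated. The query cost per iteration is two loss evaluations per prior rather than one per pixel.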

Adversarial Attacks against Deep Saliency Models

Adversarial Attack Against Deep Saliency Models Powered by Non-Redundant Priors

Demons Hidden in the Light: Unrestricted Adversarial Illumination Attacks

Detection defense against adversarial attacks with saliency map

SAD: Saliency-based Defenses Against Adversarial Examples

Towards Robustness against Unsuspicious Adversarial Examples

Focus-Shifting Attack: an Adversarial Attack That Retains Saliency Map Information and Manipulates Model Explanations

Adversarial example detection based on saliency map features

Object-Attentional Untargeted Adversarial Attack

Detecting Adversarial Perturbations with Saliency

Adversarial Examples Detection Beyond Image Space

Stealthy Adversarial Examples for Semantic Segmentation in Remote Sensing

Defense against adversarial attacks based on color space transformation

To Make Yourself Invisible with Adversarial Semantic Contours

Robust Superpixel-Guided Attentional Adversarial Attack

Saliency Attention and Semantic Similarity-Driven Adversarial Perturbation

Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples

Adversarial scratches: Deployable attacks to CNN classifiers

Investigating Human-Identifiable Features Hidden in Adversarial Perturbations

Adversarial Attacks for Embodied Agents

Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World