Abstract: Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which seriously threaten security-sensitive applications. Existing works synthesize adversarial examples by perturbing the original (benign) images and using the L_p-norm to penalize the perturbations, which restricts the pixel-wise distance between the adversarial images and the corresponding benign images. However, they add perturbations globally to the benign images without explicitly considering their content or spatial structure, resulting in noticeable artifacts, especially in originally clean regions such as sky and smooth surfaces. In this paper, we propose an invisible adversarial attack, which synthesizes adversarial examples that are visually indistinguishable from benign ones. We adaptively distribute the perturbation according to human sensitivity to a local stimulus in the benign image: the less sensitive the human eye is to a region, the more perturbation that region receives. Two types of adaptive adversarial attacks are proposed: 1) coarse-grained and 2) fine-grained. The former applies an L_p-norm regularized by novel spatial constraints, which exploit the rich information in cluttered regions to mask the perturbation. The latter, called the Just Noticeable Distortion (JND)-based adversarial attack, uses the proposed JND_p metric to better measure perceptual similarity, and adaptively sets the penalty by weighting the pixel-wise perceptual redundancy of an image. We conduct extensive experiments on the MNIST, CIFAR-10 and ImageNet datasets and a comprehensive user study with 50 participants. The experimental results demonstrate that JND_p is a better metric for measuring perceptual similarity than the L_p-norm, and that the proposed adaptive adversarial attacks synthesize adversarial examples that are indistinguishable from benign ones and outperform state-of-the-art methods.
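
The idea of placing more perturbation where the eye is less sensitive can be illustrated with a minimal sketch. This is not the method of the paper: it assumes that local standard deviation is an acceptable stand-in for perceptual insensitivity (it is not the JND_p metric), the function names `local_std` and `adaptive_perturbation` are hypothetical, and a random array stands in for the sign of a loss gradient that a real attack would compute from a model.

```python
import numpy as np

def local_std(image, radius=1):
    """Standard deviation over each pixel's (2*radius+1)^2 window (edge-padded)."""
    padded = np.pad(image, radius, mode="edge")
    h, w = image.shape
    k = 2 * radius + 1
    # Stack every shifted view of the window, then reduce over the stack.
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(k) for dx in range(k)
    ])
    return windows.std(axis=0)

def adaptive_perturbation(image, direction, budget):
    """Scale a sign perturbation by normalized local activity.

    Assumption: textured (high-variance) regions mask noise better than
    smooth ones, so they receive a larger share of the budget.
    """
    activity = local_std(image)
    peak = activity.max()
    weights = activity / peak if peak > 0 else np.zeros_like(activity)
    delta = budget * weights * np.sign(direction)
    return np.clip(image + delta, 0.0, 1.0), weights

# Toy image: smooth left half (like sky), textured right half.
rng = np.random.default_rng(0)
image = np.full((8, 8), 0.5)
image[:, 4:] = rng.uniform(0.0, 1.0, size=(8, 4))

# Placeholder for the sign of a loss gradient (an assumption, no model here).
direction = rng.standard_normal((8, 8))

adv, weights = adaptive_perturbation(image, direction, budget=0.1)
```

In this toy case the smooth columns whose neighborhoods contain no texture get zero weight and are left untouched, while the perturbation concentrates in the cluttered half. A uniform L_p-bounded attack would instead spread the same budget over both regions.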

Generate Adversarial Examples by Spatially Perturbing on the Meaningful Area

A Universal Defense Strategy Against Adversarial Attacks Based on Attention-Guided

Adversarial Examples Detection Beyond Image Space

ADSAttack: an Adversarial Attack Algorithm Via Searching Adversarial Distribution in Latent Space

Generating Adversarial Examples in Limited Queries with Image Encoding and Noise Decoding

From Spatial to Spectral Domain, a New Perspective for Detecting Adversarial Examples

An Improved Adversarial Example Generating Method with Optimized Spatial Transform

Invisible Adversarial Attack Against Deep Neural Networks: an Adaptive Penalization Approach

WBA: A Warping-based Approach to Generating Imperceptible Adversarial Examples

Improved Forward-Backward Propagation To Generate Adversarial Examples

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

Detecting Localized Adversarial Examples: A Generic Approach Using Critical Region Analysis

Adversarial Attacks on Neural Network with Batch Dimensions Perturbation and Manhattan-Distance Constraints

AdvJND: Generating Adversarial Examples with Just Noticeable Difference

Adversarial Transformation Network with Adaptive Perturbations for Generating Adversarial Examples

Adversarial Image Generation by Spatial Transformation in Perceptual Colorspaces

Adversarial Examples Detection Based on Error Level Analysis and Space Mapping

Defense against adversarial attacks based on color space transformation

Provable Defenses against Spatially Transformed Adversarial Inputs: Impossibility and Possibility Results

Generating Adversarial Examples with Adversarial Networks

Dual Attention Adversarial Attacks with Limited Perturbations