Abstract: Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which could seriously threaten security-sensitive applications. Existing works synthesize adversarial examples by perturbing benign images and use the $L_p$-norm to penalize the perturbations, which restricts the pixel-wise distance between the adversarial images and the corresponding benign images. However, they add perturbations globally to the benign images without explicitly considering their content or spatial structure, resulting in noticeable artifacts, especially in originally clean regions such as sky and smooth surfaces. In this paper, we propose an invisible adversarial attack, which synthesizes adversarial examples that are visually indistinguishable from benign ones. We adaptively distribute the perturbation according to human sensitivity to a local stimulus in the benign image: the less sensitive the human eye is to a region, the more perturbation that region receives. Two types of adaptive adversarial attacks are proposed: 1) coarse-grained and 2) fine-grained. The former applies an $L_p$-norm regularized by novel spatial constraints, which exploit the rich information in cluttered regions to mask the perturbation. The latter, called the Just Noticeable Distortion (JND)-based adversarial attack, uses the proposed JND$_p$ metric to better measure perceptual similarity, and adaptively sets the penalty by weighting the pixel-wise perceptual redundancy of an image. We conduct extensive experiments on the MNIST, CIFAR-10 and ImageNet datasets and a comprehensive user study with 50 participants. The experimental results demonstrate that JND$_p$ is a better metric for measuring perceptual similarity than the $L_p$-norm, and that the proposed adaptive adversarial attacks can synthesize adversarial examples that are indistinguishable from benign ones and outperform the state-of-the-art methods.
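The idea of penalizing perturbations more heavily in smooth regions than in cluttered ones can be illustrated with a small sketch. The abstract does not give the actual spatial constraint or the JND formula, so everything below is an assumption made for illustration: the local standard deviation is used as a stand-in for "clutter", and the function names (`local_std`, `penalty_weights`, `weighted_lp`) and the inverse-deviation weighting are hypothetical, not the authors' method.

```python
from statistics import pstdev


def local_std(image, radius=1):
    """Standard deviation of each pixel's square neighbourhood (clipped at borders)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            patch = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = pstdev(patch)
    return out


def penalty_weights(image, radius=1, eps=1e-3):
    """Hypothetical weighting: smooth regions (low deviation) get a large penalty."""
    return [[1.0 / (s + eps) for s in row] for row in local_std(image, radius)]


def weighted_lp(delta, weights, p=2):
    """L_p norm of the perturbation after scaling each pixel by its penalty weight."""
    total = sum(
        (w * abs(d)) ** p
        for d_row, w_row in zip(delta, weights)
        for d, w in zip(d_row, w_row)
    )
    return total ** (1.0 / p)


# Toy image: flat left half (like sky), checkerboard right half (like clutter).
image = [[0.5 if x < 3 else float((x + y) % 2) for x in range(6)] for y in range(4)]
weights = penalty_weights(image)

# The same perturbation magnitude placed in the flat and in the textured region.
delta_smooth = [[0.1 if (x, y) == (0, 1) else 0.0 for x in range(6)] for y in range(4)]
delta_textured = [[0.1 if (x, y) == (4, 1) else 0.0 for x in range(6)] for y in range(4)]

print(weighted_lp(delta_smooth, weights), weighted_lp(delta_textured, weights))
```

Under this assumed weighting, an optimizer minimizing the weighted norm would be pushed to place perturbation in the textured half, which is the qualitative behavior the abstract describes. An unweighted $L_p$-norm would score both perturbations identically.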

Exploiting the Inherent Limitation of $L_0$ Adversarial Examples

Exploiting the Sensitivity of $L_2$ Adversarial Examples to Erase-and-Restore

Nowhere to Hide: A Lightweight Unsupervised Detector against Adversarial Examples

Adversarial Examples Detection Through the Sensitivity in Space Mappings

Adversarial Examples: Opportunities and Challenges

Adversarial example detection for DNN models: a review and experimental comparison

HAWKEYE: Adversarial Example Detector for Deep Neural Networks

Poster: Detecting Adversarial Examples Hidden under Watermark Perturbation via Usable Information Theory

Imperceptible Adversarial Attack on Deep Neural Networks from Image Boundary

Invisible Adversarial Attack Against Deep Neural Networks: an Adaptive Penalization Approach

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

Adversarial Examples Detection Beyond Image Space

EagleEye: Attack-Agnostic Defense against Adversarial Inputs (Technical Report)

MixDefense: A Defense-in-Depth Framework for Adversarial Example Detection Based on Statistical and Semantic Analysis

Undetectable Adversarial Examples Based on Microscopical Regularization

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

You Cannot Easily Catch Me: A Low-Detectable Adversarial Patch for Object Detectors

Adversarial perturbation denoising utilizing common characteristics in deep feature space

Detecting Adversarial Examples Via Reconstruction-based Semantic Inconsistency

Adversarial Examples on Object Recognition: A Comprehensive Survey