Abstract: Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which seriously threaten security-sensitive applications. Existing works synthesize adversarial examples by perturbing the original (benign) images and use the L_p-norm to penalize the perturbations, which restricts the pixel-wise distance between the adversarial images and the corresponding benign images. However, they add perturbations globally to the benign images without explicitly considering their content or spatial structure, resulting in noticeable artifacts, especially in originally clean regions such as sky and smooth surfaces. In this paper, we propose an invisible adversarial attack, which synthesizes adversarial examples that are visually indistinguishable from benign ones. We adaptively distribute the perturbation according to human sensitivity to a local stimulus in the benign image: the less sensitive the human eye is to a region, the more perturbation that region receives. Two types of adaptive adversarial attacks are proposed: 1) coarse-grained and 2) fine-grained. The former applies an L_p-norm regularized by novel spatial constraints, which use the rich information of cluttered regions to mask the perturbation. The latter, called the Just Noticeable Distortion (JND)-based adversarial attack, uses the proposed JND_p metric to better measure perceptual similarity, and adaptively sets the penalty by weighting the pixel-wise perceptual redundancy of an image. We conduct extensive experiments on the MNIST, CIFAR-10 and ImageNet datasets and a comprehensive user study with 50 participants. The experimental results demonstrate that JND_p is a better metric for measuring perceptual similarity than the L_p-norm, and that the proposed adaptive adversarial attacks can synthesize adversarial examples that are indistinguishable from benign ones and outperform the state-of-the-art methods.
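The idea of penalizing perturbations more heavily where the eye is more sensitive can be illustrated with a minimal sketch. It is not the formulation used in the paper: it assumes, purely for illustration, that the local standard deviation of a grayscale image is a usable proxy for how cluttered a region is, and the helper names `local_std`, `sensitivity_weights` and `weighted_penalty` are hypothetical.

```python
import numpy as np

def local_std(image, radius=1):
    """Standard deviation of each pixel's (2*radius+1)^2 neighborhood (edge-padded)."""
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    size = 2 * radius + 1
    h, w = image.shape
    # Stack every shifted view of the window, then reduce over the stack.
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(size)
        for dx in range(size)
    ])
    return windows.std(axis=0)

def sensitivity_weights(image, radius=1, eps=1e-3):
    """Assumed sensitivity proxy: large in smooth regions, small in cluttered ones."""
    return 1.0 / (local_std(image, radius) + eps)

def weighted_penalty(image, delta, radius=1):
    """Squared perturbation, weighted per pixel by the sensitivity of the benign image."""
    weights = sensitivity_weights(image, radius)
    return float(np.sum(weights * delta ** 2))

# Toy image: smooth left half, textured right half.
rng = np.random.default_rng(0)
image = np.zeros((8, 8))
image[:, 4:] = rng.uniform(0.0, 1.0, size=(8, 4))

# The same single-pixel perturbation placed in each half.
delta_smooth = np.zeros((8, 8))
delta_smooth[4, 1] = 0.1
delta_textured = np.zeros((8, 8))
delta_textured[4, 6] = 0.1

print(weighted_penalty(image, delta_smooth))
print(weighted_penalty(image, delta_textured))
```

Under this assumed weighting, the perturbation in the smooth half costs far more than the identical perturbation in the textured half, so an optimizer minimizing the penalty would push perturbation toward cluttered regions.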

Undetectable Adversarial Examples Based on Microscopical Regularization.

Invisible Adversarial Attack Against Deep Neural Networks: an Adaptive Penalization Approach

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

Adversarial Examples Detection Beyond Image Space.

Adversarial Attacks Hidden in Plain Sight

Towards Robustness against Unsuspicious Adversarial Examples

Simple Transparent Adversarial Examples

Robust Superpixel-Guided Attentional Adversarial Attack

Demiguise Attack: Crafting Invisible Semantic Adversarial Perturbations with Perceptual Similarity

Detecting Localized Adversarial Examples: A Generic Approach Using Critical Region Analysis

On the (Statistical) Detection of Adversarial Examples

Natural Language Induced Adversarial Images

Detection of Adversarial Attacks via Disentangling Natural Images and Perturbations

ATMPA: Attacking Machine Learning-based Malware Visualization Detection Methods via Adversarial Examples

Towards Robust Training of Neural Networks by Regularizing Adversarial Gradients

Robust Adversarial Examples Against Scale Transformation Via Generative Network

Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization

Sparse and Imperceivable Adversarial Attacks

RetouchUAA: Unconstrained Adversarial Attack via Image Retouching

Analyzing Adversarial Robustness of Deep Neural Networks in Pixel Space: a Semantic Perspective