Abstract: Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which seriously threaten security-sensitive applications. Existing works synthesize adversarial examples by perturbing the original (benign) images, using the L_p-norm to penalize the perturbations, which restricts the pixel-wise distance between the adversarial images and the corresponding benign images. However, they add perturbations globally to the benign images without explicitly considering their content or spatial structure, resulting in noticeable artifacts, especially in originally clean regions such as sky and smooth surfaces. In this paper, we propose an invisible adversarial attack, which synthesizes adversarial examples that are visually indistinguishable from benign ones. We adaptively distribute the perturbation according to human sensitivity to a local stimulus in the benign image: the higher the insensitivity, the more perturbation is added. Two types of adaptive adversarial attacks are proposed: 1) coarse-grained and 2) fine-grained. The former applies an L_p-norm regularized by novel spatial constraints, which use the rich information of cluttered regions to mask the perturbation. The latter, called the Just Noticeable Distortion (JND)-based adversarial attack, uses the proposed JND_p metric to better measure perceptual similarity, and adaptively sets the penalty by weighting the pixel-wise perceptual redundancy of an image. We conduct extensive experiments on the MNIST, CIFAR-10 and ImageNet datasets and a comprehensive user study with 50 participants. The experimental results demonstrate that JND_p is a better metric for measuring perceptual similarity than the L_p-norm, and that the proposed adaptive adversarial attacks can synthesize adversarial examples indistinguishable from benign ones and outperform the state-of-the-art methods.
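The idea of penalizing perturbations more heavily in smooth regions than in cluttered ones can be illustrated with a minimal sketch. The abstract does not specify how sensitivity is computed, so everything here is an assumption made for illustration: local variance stands in as a rough proxy for texture, the weight is simply its inverse, and the names `local_variance`, `penalty_weights`, and `weighted_lp_penalty` are hypothetical. It is not the paper's spatial constraint or its JND-based metric.

```python
import numpy as np

def local_variance(image, radius=1):
    """Per-pixel variance over a (2*radius+1)^2 window, using edge padding.

    Assumption: local variance is used as a crude stand-in for how
    cluttered (textured) a region is.
    """
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    h, w = image.shape
    k = 2 * radius + 1
    # Stack every shifted view of the window, then take the variance across them.
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(k)
        for dx in range(k)
    ])
    return windows.var(axis=0)

def penalty_weights(image, radius=1, eps=1e-3):
    """Illustrative weights: large where the image is smooth, small where it is textured."""
    return 1.0 / (local_variance(image, radius) + eps)

def weighted_lp_penalty(delta, weights, p=2):
    """Weighted L_p penalty: (sum_i w_i * |delta_i|^p)^(1/p)."""
    return float(np.sum(weights * np.abs(delta) ** p) ** (1.0 / p))

# Toy grayscale image: flat left half, noisy right half.
rng = np.random.default_rng(0)
image = np.zeros((8, 8))
image[:, 4:] = rng.random((8, 4))
weights = penalty_weights(image)

# The same perturbation magnitude placed in the flat region vs. the noisy region.
delta_flat = np.zeros((8, 8))
delta_flat[:, :2] = 0.1
delta_noisy = np.zeros((8, 8))
delta_noisy[:, 6:] = 0.1

cost_flat = weighted_lp_penalty(delta_flat, weights)
cost_noisy = weighted_lp_penalty(delta_noisy, weights)
```

In this toy setup `cost_flat` exceeds `cost_noisy`, so an optimizer minimizing such a penalty would tend to push perturbation toward the textured half, which is the qualitative behavior the abstract describes.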

Adversarial Attacks Hidden in Plain Sight

Attack As Defense: Characterizing Adversarial Examples Using Robustness

Demiguise Attack: Crafting Invisible Semantic Adversarial Perturbations with Perceptual Similarity

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

Sparse and Imperceivable Adversarial Attacks

Imperceptible Adversarial Attack via Invertible Neural Networks

Simple Transparent Adversarial Examples

Invisible Adversarial Attack Against Deep Neural Networks: an Adaptive Penalization Approach

Improving the Invisibility of Adversarial Examples with Perceptually Adaptive Perturbation

A New Defense Against Adversarial Images: Turning a Weakness into a Strength

Investigating Human-Identifiable Features Hidden in Adversarial Perturbations

Adversarial Examples Detection Beyond Image Space

Towards Robustness against Unsuspicious Adversarial Examples

Exploiting vulnerabilities of deep neural networks for privacy protection

Perception Improvement for Free: Exploring Imperceptible Black-box Adversarial Attacks on Image Classification

Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception

A New Kind of Adversarial Example

Adversarial Image Generation by Spatial Transformation in Perceptual Colorspaces

Searching for the Essence of Adversarial Perturbations

Adversarial Camouflage: Hiding Physical-World Attacks with Natural Styles

A General Framework for Adversarial Examples with Objectives