Abstract: Recent studies show that deep neural networks are vulnerable to adversarial attacks in the form of subtle perturbations to the input image, which lead the model to output a wrong prediction. Such attacks succeed easily with existing white-box methods, where the perturbation is calculated from the gradient of the target network. Unfortunately, the gradient is often unavailable in real-world scenarios, which makes black-box adversarial attack problems both practical and challenging. In fact, they can be formulated as high-dimensional black-box optimization problems at the pixel level. Although evolutionary algorithms are well known for solving black-box optimization problems, they cannot deal efficiently with a high-dimensional decision space. Therefore, we propose an approximated gradient sign method using differential evolution (DE) for solving black-box adversarial attack problems. Unlike most existing methods, the proposed method uses a DE algorithm to search for the gradient sign rather than the perturbation itself. We also transform the pixel-based decision space into a dimension-reduced decision space by combining the pixel differences between the input image and its neighbor images, and we introduce two different techniques for selecting neighbor images to build the transformed decision space. In addition, six variants of the proposed method are designed according to the different neighborhood selection and optimization search strategies. Finally, the performance of the proposed method is compared with a number of state-of-the-art adversarial attack algorithms on the CIFAR-10 and ImageNet datasets. The experimental results suggest that the proposed method shows superior performance in solving black-box adversarial attack problems, especially nontargeted attack problems.
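To make the idea in the abstract concrete, the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes that the approximated gradient sign is the elementwise sign of a weighted sum of neighbor differences, that DE searches only the few weights (one per neighbor) rather than every pixel, and that a standard DE/rand/1/bin scheme is used. All function names, parameter values, and the toy linear scorer standing in for the black-box model are hypothetical.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def approx_gradient_sign(weights, diffs):
    """Elementwise sign of the weighted sum of neighbor differences (assumed form)."""
    n_pixels = len(diffs[0])
    return [sign(sum(w * d[i] for w, d in zip(weights, diffs)))
            for i in range(n_pixels)]


def perturb(image, weights, diffs, eps):
    """Step eps along the approximated sign, then clip pixels to [0, 1]."""
    signs = approx_gradient_sign(weights, diffs)
    return [min(1.0, max(0.0, p + eps * s)) for p, s in zip(image, signs)]


def de_attack(score_fn, image, neighbors, eps=0.1, pop_size=12,
              generations=30, f=0.5, cr=0.9, seed=0):
    """Minimize score_fn over neighbor weights with DE/rand/1/bin.

    score_fn is queried as a black box: only its output value is used.
    The decision space has len(neighbors) dimensions, not len(image).
    """
    rng = random.Random(seed)
    diffs = [[n - x for n, x in zip(nb, image)] for nb in neighbors]
    k = len(diffs)

    def fitness(weights):
        return score_fn(perturb(image, weights, diffs, eps))

    pop = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(pop_size)]
    fit = [fitness(w) for w in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(k)  # guarantees at least one mutated gene
            trial = [
                pop[a][j] + f * (pop[b][j] - pop[c][j])
                if (rng.random() < cr or j == j_rand) else pop[i][j]
                for j in range(k)
            ]
            trial_fit = fitness(trial)
            if trial_fit <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, trial_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return perturb(image, pop[best], diffs, eps), fit[best]


# Toy usage: a hidden linear scorer stands in for the black-box model's
# confidence in the true class (lower means a more successful attack).
hidden = [0.9, -0.4, 0.7, -0.2, 0.5, -0.8, 0.3, -0.6]


def toy_score(img):
    return sum(h * p for h, p in zip(hidden, img))


image = [0.5] * 8
demo_rng = random.Random(1)
neighbors = [[min(1.0, max(0.0, p + demo_rng.uniform(-0.2, 0.2))) for p in image]
             for _ in range(3)]
adv, adv_score = de_attack(toy_score, image, neighbors, eps=0.1)
print(toy_score(image), adv_score)
```

In this sketch the search space has three dimensions (one weight per neighbor) instead of eight (one per pixel), which illustrates the kind of dimension reduction the abstract describes. Because every pixel moves by at most `eps`, the perturbation stays within an L-infinity bound by construction. How the neighbor images are actually selected, and how the six variants differ, is not specified in the abstract and is left open here.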

Universal Perturbation Generation for Black-box Attack Using Evolutionary Algorithms

Evolution Attack On Neural Networks

An Universal Perturbation Generator for Black-Box Attacks Against Object Detectors

Universal Adversarial Perturbation Generated by Attacking Layer-wise Relevance Propagation

Art-Attack: Black-Box Adversarial Attack via Evolutionary Art

A Multi-objective Examples Generation Approach to Fool the Deep Neural Networks in the Black-Box Scenario

A Universal Targeted Attack Method against Image Classification

GenAttack: Practical Black-box Attacks with Gradient-Free Optimization

Image Adversarial Example Generation Method Based on Adaptive Parameter Adjustable Differential Evolution

Decision-based Universal Adversarial Attack

Generating Universal Language Adversarial Examples by Understanding and Enhancing the Transferability Across Neural Models

Improving Transferability of Universal Adversarial Perturbation with Feature Disruption

ABCAttack: A Gradient-Free Optimization Black-Box Attack for Fooling Deep Image Classifiers

Comparative Evaluation of Recent Universal Adversarial Perturbations in Image Classification

Generalizing universal adversarial perturbations for deep neural networks

Interpreting Universal Adversarial Example Attacks on Image Classification Models

Adaptive Perturbation for Adversarial Attack

AdvFoolGen: Creating Persistent Troubles for Deep Classifiers

An Approximated Gradient Sign Method Using Differential Evolution For Black-box Adversarial Attack

DCVAE-adv: A Universal Adversarial Example Generation Method for White and Black Box Attacks

Towards cross-task universal perturbation against black-box object detectors in autonomous driving