Abstract: Recent studies have shown that machine learning algorithms are susceptible to imperceptible perturbations. These studies focus on laboratory settings, where the attacker knows internal details of the victim model or receives feedback such as class probabilities. A gap remains between theory and the physical world: the risk of adversarial attacks under more extreme and realistic conditions still needs to be assessed. Here we propose a knowledge-restricted black-box attack model in which the attacker can obtain only the final predicted label. We also model the attacker as resource-restricted, for example query-limited. These limits on knowledge and resources mean that previous work cannot be applied directly. The current state-of-the-art method for this problem is the boundary attack, but it requires a large number of queries. In this paper, we make several contributions to investigate the vulnerability of machine learning models in more realistic scenarios. First, we reformulate the optimization problem and measure the quality of sample points by L2 distance. Second, we provide a more effective algorithm that uses the cutting plane method and local optimization. Third, we propose two effective dynamic defense strategies that are easy to implement. Finally, we conduct an experimental evaluation on MNIST, Fashion-MNIST, and malware detection datasets. The results show that (1) compared with the state-of-the-art method, our cutting plane method reduces the number of queries while preserving attack efficiency; (2) the dynamic defense strategy is effective against label-only adversarial attacks, with the attack success rate dropping from nearly 100% to 23% while classification accuracy remains considerable; and (3) the improved defense strategy guarantees the effectiveness of the defense and improves the stability of the whole model.
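To make the threat model concrete, the following is a minimal sketch of a label-only, query-limited oracle, together with a binary search that shrinks the L2 distance of an already misclassified point. It is not the cutting plane method of the paper; it is a generic building block of decision-based attacks, shown only for illustration. The names `LabelOnlyOracle`, `shrink_toward_original`, and `toy_predict`, as well as the toy linear victim model, are assumptions introduced here.

```python
import math


class LabelOnlyOracle:
    """Hypothetical wrapper: exposes only the predicted label and counts queries."""

    def __init__(self, predict_label, max_queries):
        self._predict_label = predict_label
        self.max_queries = max_queries
        self.queries = 0

    def __call__(self, x):
        # Enforce the attacker's query budget.
        if self.queries >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.queries += 1
        return self._predict_label(x)


def l2_distance(a, b):
    """Euclidean (L2) distance between two points given as sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def interpolate(a, b, t):
    """Point at fraction t on the segment from a (t=0) to b (t=1)."""
    return [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]


def shrink_toward_original(oracle, x_orig, x_adv, true_label, steps=20):
    """Binary search on the segment [x_orig, x_adv] for a misclassified
    point that is as close to x_orig as the search resolution allows."""
    lo, hi = 0.0, 1.0  # t=hi is known to be misclassified, t=lo is not
    for _ in range(steps):
        mid = (lo + hi) / 2
        if oracle(interpolate(x_orig, x_adv, mid)) != true_label:
            hi = mid  # still adversarial: move closer to the original
        else:
            lo = mid  # back on the correct side: step away again
    return interpolate(x_orig, x_adv, hi)


def toy_predict(x):
    """Toy victim (assumption): linear decision rule x1 + x2 > 1."""
    return int(sum(x) > 1.0)


oracle = LabelOnlyOracle(toy_predict, max_queries=50)
x_orig = [0.0, 0.0]  # classified as 0
x_adv = [2.0, 2.0]   # classified as 1, used as the starting point
x_best = shrink_toward_original(oracle, x_orig, x_adv, true_label=0)
distance = l2_distance(x_best, x_orig)
```

Each binary-search step costs exactly one query, so the query counter makes the trade-off between budget and final L2 distance explicit. This is the quantity a query-limited attacker has to economize.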

Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data

Blindfolded Attackers Still Threatening: Strict Black-Box Adversarial Attacks on Graphs

Towards Efficient Data Free Blackbox Adversarial Attack

Query-free Black-box Adversarial Attacks on Graphs

Improving Query Efficiency of Black-box Adversarial Attack

Query-efficient label-only attacks against black-box machine learning models

Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples

BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack

STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario

SemiAdv: Query-Efficient Black-Box Adversarial Attack with Unlabeled Images

Hard-Label Black-Box Attacks on 3D Point Clouds

Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-box Attacks

Adversarial Eigen Attack on BlackBox Models

Scaling Laws for Black box Adversarial Attacks

Data-Free Adversarial Perturbations for Practical Black-Box Attack

A Restricted Black-Box Adversarial Framework Towards Attacking Graph Embedding Models

Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models

Adversarial Attacks with Time-Scale Representations

Towards Lightweight Black-Box Attacks against Deep Neural Networks

Heuristic Black-box Adversarial Attacks on Video Recognition Models