BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack

Viet Quoc Vo, Ehsan Abbasnejad, Damith C. Ranasinghe
2024-06-01
Abstract: We study the unique, less well-understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries. Sparse attacks aim to discover a minimum number of perturbations to model inputs, bounded in the l0 norm, to craft adversarial examples and misguide model decisions. But, in contrast to query-based dense attack counterparts against black-box models, constructing sparse adversarial perturbations, even when models serve confidence score information to queries in a score-based setting, is non-trivial, because such an attack leads to i) an NP-hard problem; and ii) a non-differentiable search space. We develop BruSLeAttack, a new, faster (more query-efficient) Bayesian algorithm for the problem. We conduct extensive attack evaluations including an attack demonstration against a Machine Learning as a Service (MLaaS) offering exemplified by Google Cloud Vision and robustness testing of adversarial training regimes and a recent defense against black-box attacks. The proposed attack scales to achieve state-of-the-art attack success rates and query efficiency on standard computer vision tasks such as ImageNet across different model architectures. Our artefacts and DIY attack samples are available on GitHub. Importantly, our work facilitates faster evaluation of model vulnerabilities and raises our vigilance on the safety, security and reliability of deployed systems.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the problem of generating sparse adversarial examples in a black-box setting, using only the confidence scores that the model returns in response to queries. Specifically, given an input image, the goal is to perturb as few pixels as possible (that is, to perturb under an \(l_0\) constraint) so that the resulting example misleads a machine-learning model into a wrong decision. The problem is challenging for three reasons:

1. **NP-hard problem**: Minimizing the \(l_0\) norm is NP-hard, so finding the optimal solution in a high-dimensional space is very difficult.
2. **Non-differentiable search space**: Because of the properties of the \(l_0\) norm, the search space is discontinuous and non-differentiable, so traditional optimization methods cannot be applied directly.
3. **Mixed search space**: Generating a sparse adversarial example requires deciding both which pixels to perturb and which color values to assign to them, which makes the search space more complex.

To address these challenges, the paper proposes BruSLeAttack, a fast (more query-efficient) sparse adversarial attack algorithm built on a Bayesian framework. BruSLeAttack relies on the following ideas:

- **Reducing the dimension of the search space**: By introducing a synthetic color image \(x'\), the color value of each pixel is restricted to a discrete space, which significantly reduces the dimension of the search space.
- **Bayesian framework**: The history of pixel manipulations is used to learn the influence of each pixel and to guide the selection of new pixels accordingly, so that sparse adversarial examples are found more efficiently.
- **Pixel dissimilarity map**: The pixel-wise dissimilarity between the source image and the synthetic color image is computed and used to accelerate the movement of the adversarial example towards the decision boundary.

The paper evaluates BruSLeAttack experimentally on multiple datasets (CIFAR-10, STL-10, and ImageNet). The results show that BruSLeAttack significantly improves the attack success rate (ASR) across different query budgets and sparsity levels, and that Vision Transformers (ViT) are more robust to sparse adversarial attacks than convolutional neural networks (CNNs).
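
As a rough illustration of the synthetic-image and dissimilarity-map ideas described above, the following minimal sketch composes a candidate sparse adversarial image by copying a small set of pixels from a synthetic image \(x'\) into the source image. It is not the paper's algorithm: the function names, the choice of \(\{0, 1\}\) per channel as the discrete color space, the L2 distance used for dissimilarity, and the use of dissimilarity alone as sampling weights are all assumptions made for illustration. The Bayesian update from query history and the model queries themselves are omitted.

```python
import numpy as np

def compose_sparse_adversarial(x, x_synth, mask):
    """Copy pixels of x_synth into x wherever the binary mask is True.

    x, x_synth: float arrays of shape (H, W, C); mask: bool array of shape (H, W).
    """
    return np.where(mask[..., None], x_synth, x)

def dissimilarity_map(x, x_synth):
    """Per-pixel L2 distance between source and synthetic image (assumed metric)."""
    return np.linalg.norm(x - x_synth, axis=-1)

def sample_mask(weights, k, rng):
    """Pick k distinct pixel positions with probability proportional to weights."""
    h, w = weights.shape
    p = weights.ravel() / weights.sum()
    idx = rng.choice(h * w, size=k, replace=False, p=p)
    mask = np.zeros(h * w, dtype=bool)
    mask[idx] = True
    return mask.reshape(h, w)

rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                        # stand-in source image in [0, 1)
# Assumed discrete color space: every channel of x' is either 0 or 1.
x_synth = rng.integers(0, 2, size=(8, 8, 3)).astype(x.dtype)

weights = dissimilarity_map(x, x_synth)          # favor pixels that differ most
mask = sample_mask(weights, k=5, rng=rng)        # sparsity budget of 5 pixels
x_adv = compose_sparse_adversarial(x, x_synth, mask)

# l0 distance counted in pixels: positions where any channel changed.
l0 = int(np.any(x_adv != x, axis=-1).sum())
```

Because every perturbed pixel takes its value from \(x'\), the search reduces to choosing the binary mask, which is the sense in which fixing a synthetic image shrinks the mixed search space. In an actual attack, each candidate `x_adv` would be sent to the model and the returned scores would inform the next choice of mask.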