STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario

Renyang Liu, Kwok-Yan Lam, Wei Zhou, Sixing Wu, Jun Zhao, Dongting Hu, Mingming Gong
2024-10-23
Abstract: Many attack techniques have been proposed to explore the vulnerability of DNNs and further help to improve their robustness. Despite the significant progress made recently, existing black-box attack methods still suffer from unsatisfactory performance due to the vast number of queries needed to optimize desired perturbations. Besides, the other critical challenge is that adversarial examples built in a noise-adding manner are abnormal and struggle to successfully attack robust models, whose robustness is enhanced by adversarial training against small perturbations. There is no doubt that these two issues mentioned above will significantly increase the risk of exposure and result in a failure to dig deeply into the vulnerability of DNNs. Hence, it is necessary to evaluate DNNs' fragility sufficiently under query-limited settings in a non-additional way. In this paper, we propose the Spatial Transform Black-box Attack (STBA), a novel framework to craft formidable adversarial examples in the query-limited scenario. Specifically, STBA introduces a flow field to the high-frequency part of clean images to generate adversarial examples and adopts the following two processes to enhance their naturalness and significantly improve the query efficiency: a) we apply an estimated flow field to the high-frequency part of clean images to generate adversarial examples instead of introducing external noise to the benign image, and b) we leverage an efficient gradient estimation method based on a batch of samples to optimize such an ideal flow field under query-limited settings. Compared to existing score-based black-box baselines, extensive experiments indicated that STBA could effectively improve the imperceptibility of the adversarial examples and remarkably boost the attack success rate under query-limited settings.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses the problem of evaluating the robustness of deep neural networks (DNNs) in the query-limited black-box scenario. It identifies two main weaknesses in how existing black-box attack methods generate adversarial examples:

1. **Low query efficiency**: Most existing black-box attacks need a large number of queries to optimize an adversarial example, which increases the computational cost and makes the attack easier for defenders to detect.
2. **Poor naturalness of adversarial examples**: Adversarial examples built by adding noise often look unnatural and struggle to attack robust models that have been hardened by adversarial training.

To address these problems, the paper proposes a new black-box attack framework, the Spatial Transform Black-box Attack (STBA). Its main contributions are:

- **Spatial transformation instead of additive noise**: STBA generates adversarial examples by spatially transforming the image rather than adding noise to it, which improves their stealth and naturalness.
- **Processing of the high-frequency part**: STBA decomposes the image into high-frequency and low-frequency parts and applies the estimated flow field only to the high-frequency part, which yields more natural adversarial examples.
- **Efficient gradient estimation**: STBA uses an efficient gradient estimation method to optimize the flow field under a limited query budget, which improves both the attack success rate and the query efficiency.

With these techniques, STBA generates high-quality adversarial examples within a limited number of queries and achieves a high attack success rate on multiple datasets.
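The three ideas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the box-blur low-pass filter, the bilinear warp, and the NES-style antithetic gradient estimator are stand-ins chosen for illustration, and every name and parameter (`split_frequencies`, `warp`, `make_adversarial`, `estimate_gradient`, `k`, `sigma`, `n_pairs`) is hypothetical. The image is assumed to be a single-channel array with values in [0, 1], and `loss_fn` is assumed to wrap one query to the black-box model and return a scalar score.

```python
import numpy as np

def split_frequencies(img, k=5):
    """Split img into a box-blurred low-frequency part and the residual.

    Assumption: a k x k box blur stands in for the low-pass filter.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    low = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            low += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    low /= k * k
    return low, img - low

def warp(img, flow):
    """Bilinear resampling: output pixel (y, x) reads from
    (y + flow[0, y, x], x + flow[1, y, x]), clamped to the image border."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(ys + flow[0], 0, h - 1)
    sx = np.clip(xs + flow[1], 0, w - 1)
    y0 = np.floor(sy).astype(int)
    x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = sy - y0
    wx = sx - x0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def make_adversarial(img, flow):
    """Warp only the high-frequency part, then add the low-frequency part back."""
    low, high = split_frequencies(img)
    return np.clip(low + warp(high, flow), 0.0, 1.0)

def estimate_gradient(loss_fn, img, flow, sigma=0.1, n_pairs=10, rng=None):
    """NES-style antithetic estimate of d loss / d flow.

    Assumption: this estimator stands in for the batch-based method
    mentioned in the abstract. It spends 2 * n_pairs queries.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(flow, dtype=float)
    for _ in range(n_pairs):
        u = rng.standard_normal(flow.shape)
        loss_plus = loss_fn(make_adversarial(img, flow + sigma * u))
        loss_minus = loss_fn(make_adversarial(img, flow - sigma * u))
        grad += (loss_plus - loss_minus) * u
    return grad / (2 * sigma * n_pairs)
```

In this sketch a zero flow field leaves the image unchanged, so the perturbation comes entirely from moving existing pixels rather than from added noise. An attack loop would repeatedly call `estimate_gradient` and take a small step on `flow` in the direction that increases the attack loss, stopping once the model's prediction changes or the query budget is spent.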