Abstract: Adversarial attacks threaten the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by querying it many times and producing global perturbations. However, not all pixels are equally crucial to the target model, so treating them indiscriminately inevitably increases query overhead. In addition, existing black-box attacks take clean samples as starting points, which also limits query efficiency. In this article, we propose a novel black-box attack framework, built on a strategy of dual transferability (DT), that perturbs the discriminative areas of clean examples within a limited number of queries. The first kind of transferability is the transferability of model interpretations. Based on this property, we identify the discriminative areas of clean samples for generating local perturbations. The second is the transferability of adversarial examples, which helps us produce local pre-perturbations that further improve query efficiency. We obtain both kinds of transferability from an independent auxiliary model, so neither incurs extra query overhead. After identifying discriminative areas and generating pre-perturbations, we use the pre-perturbed samples as better starting points and further perturb them locally in a black-box manner to search for the corresponding adversarial examples. The DT strategy is general; thus, the proposed framework can be applied to different types of black-box attacks. We conduct extensive experiments to show that, under various system settings, our framework significantly improves both the query efficiency and the attack success rates of existing black-box attacks.
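The three stages described in the abstract (locating discriminative areas with an auxiliary model, pre-perturbing those areas, then searching locally against the black-box target) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes two small linear classifiers in place of deep networks, uses the auxiliary weight magnitudes as a stand-in for a model interpretation, a single signed-gradient step as the pre-perturbation, and a random coordinate search as the black-box attack. All names (`top_k_mask`, `pre_perturb`, `local_black_box_search`) and all numeric values are hypothetical.

```python
import random

def score(weights, x):
    """Toy binary classifier: a positive score means the sample keeps its clean label."""
    return sum(w * v for w, v in zip(weights, x))

def top_k_mask(aux_weights, k):
    """Pick the k most influential features according to the auxiliary model.
    For a linear model the input gradient equals the weight vector (assumption)."""
    order = sorted(range(len(aux_weights)),
                   key=lambda i: abs(aux_weights[i]), reverse=True)
    return set(order[:k])

def pre_perturb(x, aux_weights, mask, eps):
    """One signed-gradient step on the auxiliary model, applied only inside the mask."""
    sign = lambda v: (v > 0) - (v < 0)
    return [v - eps * sign(aux_weights[i]) if i in mask else v
            for i, v in enumerate(x)]

def local_black_box_search(x_clean, x_start, query, mask, eps, budget, rng):
    """Random coordinate search over masked features only.
    Each candidate stays within eps of the clean sample; every call to query() is counted."""
    x = list(x_start)
    best = query(x)
    queries = 1
    indices = sorted(mask)
    while best > 0 and queries < budget:
        i = rng.choice(indices)
        candidate = list(x)
        candidate[i] = x_clean[i] + rng.choice((-eps, eps))
        s = query(candidate)
        queries += 1
        if s < best:  # keep the move only if it lowers the target's score
            x, best = candidate, s
    return x, best, queries

# Hypothetical toy data: the attacker sees w_aux but can only query the target.
rng = random.Random(0)
w_target = [0.9, -0.2, 0.05, 0.7, 0.1, -0.6]
w_aux = [0.8, -0.1, 0.10, 0.6, 0.0, -0.7]
x_clean = [0.5, 0.2, 0.3, 0.4, 0.1, 0.2]
eps, budget = 0.3, 50

query = lambda x: score(w_target, x)             # black-box access to the target
mask = top_k_mask(w_aux, k=3)                    # stage 1: discriminative area
x_pre = pre_perturb(x_clean, w_aux, mask, eps)   # stage 2: local pre-perturbation
x_adv, final_score, used = local_black_box_search(
    x_clean, x_pre, query, mask, eps, budget, rng)  # stage 3: local search
print(sorted(mask), final_score <= 0, used)
```

In this toy setup the two weight vectors are deliberately similar, so the pre-perturbed point already crosses the target's decision boundary and the search stops after its first query. With a less similar auxiliary model the loop would continue, still touching only the masked features.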

Diversity can be Transferred: Output Diversification for White- and Black-box Attacks

You See What I Want You to See: Exploring Targeted Black-Box Transferability Attack for Hash-based Image Retrieval Systems

Promoting Adversarial Transferability via Dual-Sampling Variance Aggregation and Feature Heterogeneity Attacks

Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input

Improving Transferability of Adversarial Examples With Input Diversity

Improving the Transferability of Adversarial Examples with Diverse Gradients

Towards Query-Efficient Black-Box Attacks: A Universal Dual Transferability-Based Framework

Improving the Transferability of Adversarial Examples with Advanced Diversity-Ensemble Method

Effectively Improving Data Diversity of Substitute Training for Data-Free Black-Box Attack

Adv-BDPM: Adversarial Attack Based on Boundary Diffusion Probability Model

Improving Query Efficiency of Black-box Adversarial Attack

Diversified Adversarial Attacks based on Conjugate Gradient Method

Towards robust neural networks via orthogonal diversity

Enhancing Output Diversity Improves Conjugate Gradient-based Adversarial Attacks

Boosting Adversarial Attack Transferability Via Random Block Shuffle

Diversifying the High-level Features for better Adversarial Transferability

Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples

Boosting the Transferability of Adversarial Examples via Local Mixup and Adaptive Step Size