Abstract: Backdoor attacks aim to inject backdoors into victim machine learning models at training time, such that the backdoored model retains the predictive power of the original model on clean inputs but misbehaves on inputs that carry the trigger. Such attacks are feasible because resource-limited users usually download sophisticated models from model zoos or query models through MLaaS rather than training a model from scratch, which gives a malicious third party the chance to provide a backdoored model. In general, the more precious the provided model (e.g., a model trained on a rare dataset), the more popular it is with users. In this article, from the perspective of a malicious model provider, we propose a black-box backdoor attack, named B3, in which neither the rare victim model (including its architecture, parameters, and hyperparameters) nor its training data is available to the adversary. To facilitate backdoor attacks in the black-box scenario, we design a cost-effective model extraction method that leverages a carefully constructed query dataset to steal the functionality of the victim model within a limited budget. Because the trigger is key to a successful backdoor attack, we develop a novel trigger generation algorithm that strengthens the bond between the trigger and the targeted misclassification label through the neuron with the highest impact on the targeted label. Extensive experiments have been conducted on various simulated deep learning models and on the commercial API of Alibaba Cloud Compute Service. We demonstrate that B3 achieves a high attack success rate while maintaining high prediction accuracy on benign inputs. We also show that B3 is robust against state-of-the-art backdoor defense strategies, such as model pruning and NC.
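The two quantities the abstract reports, attack success rate and accuracy on benign inputs, can be illustrated with a minimal sketch. The definitions below are the ones commonly used in the backdoor literature and are an assumption here, since the abstract does not state the exact formulas; the function names and the choice to exclude inputs whose true label already equals the target are likewise illustrative.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of benign (trigger-free) inputs that are classified correctly."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and of equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of triggered inputs that are classified as the target label.

    Assumption: inputs whose true label is already the target are excluded,
    because predicting the target for them does not indicate a backdoor.
    """
    if len(triggered_preds) != len(true_labels):
        raise ValueError("triggered_preds and true_labels must be of equal length")
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no inputs with a true label different from the target")
    return sum(p == target_label for p in eligible) / len(eligible)


# Toy predictions, made up purely for illustration.
acc = clean_accuracy([0, 1, 2, 2], [0, 1, 2, 1])
asr = attack_success_rate([7, 7, 3, 7], [1, 2, 3, 7], target_label=7)
```

A backdoor is considered effective when both numbers are high at once: the first shows that benign behaviour is preserved, the second that the trigger reliably forces the target label.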

B3: Backdoor Attacks Against Black-box Machine Learning Models
