Abstract: Backdoor attacks aim to inject a backdoor into a classifier such that it predicts any input with an attacker-chosen backdoor trigger as an attacker-chosen target class. Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture. As a result, they are 1) not applicable when clean data is unavailable, 2) less efficient when the model is large, and 3) less stealthy due to architecture changes. In this work, we propose DFBA, a novel retraining-free and data-free backdoor attack without changing the model architecture. Technically, our proposed method modifies a few parameters of a classifier to inject a backdoor. Through theoretical analysis, we verify that our injected backdoor is provably undetectable and unremovable by various state-of-the-art defenses under mild assumptions. Our evaluation on multiple datasets further demonstrates that our injected backdoor: 1) incurs negligible classification loss, 2) achieves 100% attack success rates, and 3) bypasses six existing state-of-the-art defenses. Moreover, our comparison with a state-of-the-art non-data-free backdoor attack shows our attack is more stealthy and effective against various defenses while achieving less classification accuracy loss.

What problem does this paper attempt to address?

This paper addresses how to inject a backdoor into a pre-trained classifier without retraining it or modifying its architecture. Existing backdoor attacks usually need access to some clean data to retrain the model, or need to change the model architecture to inject the backdoor. These methods have the following problems:

1. **Unavailable data**: When no clean data is available, existing methods are not applicable.
2. **Low efficiency**: For large-scale models, retraining is very time-consuming and resource-intensive.
3. **Poor stealth**: Changes to the model architecture may be detected, which reduces the stealthiness of the attack.

For this reason, the paper proposes a new data-free, retraining-free backdoor attack (DFBA), which injects the backdoor by directly modifying the model parameters, without accessing any data or changing the model architecture. The main contributions of the paper include:

- **First data-free, retraining-free backdoor attack**: DFBA directly modifies the parameters of the classifier to inject the backdoor without accessing any data or changing the model architecture.
- **Theoretical analysis**: The backdoor injected by DFBA is proved to be undetectable and unremovable by a variety of existing defense methods.
- **Comprehensive evaluation**: Experiments on multiple benchmark datasets demonstrate the effectiveness and efficiency of DFBA.
- **Resistance to existing defense methods**: Experiments show that DFBA can bypass six state-of-the-art defense methods and is not sensitive to the choice of hyperparameters.

### Paper Abstract

The paper proposes DFBA, a new data-free, retraining-free backdoor attack that can inject a backdoor into a pre-trained classifier without changing the model architecture.
Theoretical analysis proves that the injected backdoor is undetectable and unremovable by a variety of existing defense methods. Experimental results show that DFBA achieves a 100% attack success rate on multiple datasets, keeps the classification accuracy loss on clean test inputs below 3%, and bypasses six state-of-the-art defense methods. Compared with existing non-data-free backdoor attacks, DFBA is stealthier and more effective.

### Main Contributions

- **Data-free, retraining-free backdoor attack**: DFBA directly modifies the parameters of the classifier to inject the backdoor without accessing any data or changing the model architecture.
- **Theoretical guarantee**: The backdoor injected by DFBA is proved to be undetectable and unremovable by a variety of existing defense methods.
- **Comprehensive evaluation**: Experiments on multiple benchmark datasets demonstrate the effectiveness and efficiency of DFBA.
- **Resistance to existing defense methods**: Experiments show that DFBA can bypass six state-of-the-art defense methods and is not sensitive to the choice of hyperparameters.

### Technical Details

DFBA injects the backdoor through the following steps:

1. **Select neurons**: Select one neuron from each layer to form a backdoor path.
2. **Design the backdoor switch**: Modify the parameters of the first neuron so that it is activated by a backdoored input and hardly activated by a clean input.
3. **Amplify the output of the backdoor switch**: Modify the parameters of the intermediate-layer neurons on the path so that they gradually amplify the output of the backdoor switch.
4. **Adjust the weights of the output layer**: Modify the weights of the output-layer neurons so that the output of the backdoor path contributes positively to the target class and negatively to the non-target classes.
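
The four steps above can be illustrated on a toy two-hidden-layer ReLU network. The following is a minimal sketch under stated assumptions, not the paper's implementation: inputs are assumed to lie in [0, 1], the trigger is assumed to be a fixed binary pattern on a few input positions, the backdoor path always uses neuron 0 of each hidden layer, and the names `inject_backdoor`, `prune`, `lam`, `gamma`, and `amp` are hypothetical. The summary does not describe how the paper selects neurons or constructs the trigger, so those choices are placeholders.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def forward(params, x):
    """Forward pass of a 2-hidden-layer ReLU MLP; x has shape (n, d)."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(x @ W1.T + b1)
    h2 = relu(h1 @ W2.T + b2)
    return h2 @ W3.T + b3


def random_mlp(rng, d=16, h=8, c=3):
    """Random weights standing in for a pre-trained classifier (assumption)."""
    shapes = [(h, d), (h, h), (c, h)]
    return [(rng.normal(size=s), rng.normal(size=s[0])) for s in shapes]


def inject_backdoor(params, trig_idx, trig_val, target,
                    lam=0.2, gamma=1.0, amp=100.0):
    """Edit a few parameters along one neuron path; no data, no retraining."""
    (W1, b1), (W2, b2), (W3, b3) = [(W.copy(), b.copy()) for W, b in params]
    n1, n2 = 0, 0  # Step 1: one neuron per hidden layer (placeholder choice).

    # Step 2: backdoor switch. For a patch p in [0, 1]^k the pre-activation is
    # lam - gamma * ||p - trig_val||_1, so it is positive only near the trigger.
    W1[n1, :] = 0.0
    W1[n1, trig_idx] = np.where(trig_val > 0.5, gamma, -gamma)
    b1[n1] = lam - W1[n1, trig_idx] @ trig_val

    # Step 3: amplify the switch output and isolate the path, so that no
    # other second-layer neuron reads the switch neuron.
    W2[:, n1] = 0.0
    W2[n2, :] = 0.0
    W2[n2, n1] = amp
    b2[n2] = 0.0

    # Step 4: push the target logit up and all other logits down.
    W3[:, n2] = -amp
    W3[target, n2] = amp
    return [(W1, b1), (W2, b2), (W3, b3)]


def prune(params, n1=0, n2=0):
    """Reference model: the same network with the path neurons zeroed out."""
    (W1, b1), (W2, b2), (W3, b3) = [(W.copy(), b.copy()) for W, b in params]
    W1[n1, :] = 0.0
    b1[n1] = 0.0
    W2[:, n1] = 0.0
    W2[n2, :] = 0.0
    b2[n2] = 0.0
    W3[:, n2] = 0.0
    return [(W1, b1), (W2, b2), (W3, b3)]
```

In this sketch, an input whose patch is farther than `lam / gamma` (in L1 distance) from the trigger pattern leaves the switch at zero, so the edited network produces the same logits as the pruned reference network. An input carrying the exact trigger yields a switch output of `lam`, which the remaining two edits turn into a large logit margin in favor of `target`.
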
### Theoretical Analysis

- **Utility analysis**: It is proved that when the classifier with the backdoor injected by DFBA processes clean inputs, its output is the same as that of the pruned classifier, thus maintaining the classification accuracy.
- **Effectiveness analysis**: It is proved that the backdoor injected by DFBA is undetectable and unremovable under a variety of existing defense methods.
- **Fine-tuning analysis**: It is proved that fine-tuning the backdoored classifier with clean data will not affect the parameters of the backdoor path.

### Experimental Evaluation

- **Datasets**: Experiments were carried out on multiple benchmark datasets, including MNIST, CIFAR-10, etc.
- **Models**: Evaluated models with different architectures, such as VGG, ResNet, etc.
- **Comparison baselines**: Compared with existing state-of-the-art backdoor attack methods, demonstrating the superiority of DFBA.
- **Defense methods**: Evaluated DFBA against six

Data Free Backdoor Attacks
