Data Free Backdoor Attacks

Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song
2024-12-09
Abstract: Backdoor attacks aim to inject a backdoor into a classifier such that it predicts any input with an attacker-chosen backdoor trigger as an attacker-chosen target class. Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture. As a result, they are 1) not applicable when clean data is unavailable, 2) less efficient when the model is large, and 3) less stealthy due to architecture changes. In this work, we propose DFBA, a novel retraining-free and data-free backdoor attack without changing the model architecture. Technically, our proposed method modifies a few parameters of a classifier to inject a backdoor. Through theoretical analysis, we verify that our injected backdoor is provably undetectable and unremovable by various state-of-the-art defenses under mild assumptions. Our evaluation on multiple datasets further demonstrates that our injected backdoor: 1) incurs negligible classification loss, 2) achieves 100% attack success rates, and 3) bypasses six existing state-of-the-art defenses. Moreover, our comparison with a state-of-the-art non-data-free backdoor attack shows our attack is more stealthy and effective against various defenses while achieving less classification accuracy loss.
Cryptography and Security, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to inject a backdoor into a pre-trained classifier without retraining it or modifying its architecture. Existing backdoor attacks usually need access to some clean data to retrain the model, or they change the model architecture to inject the backdoor. These approaches have the following problems:

1. **Unavailable data**: When no clean data is available, existing methods are not applicable.
2. **Low efficiency**: For large-scale models, retraining is time-consuming and resource-intensive.
3. **Poor stealth**: Architecture changes may be detected, which makes the attack less stealthy.

To address these problems, the paper proposes DFBA, a data-free, retraining-free backdoor attack that injects the backdoor by directly modifying model parameters, without accessing any data or changing the model architecture. The main contributions of the paper include:

- **First data-free, retraining-free backdoor attack**: DFBA directly modifies the parameters of the classifier to inject the backdoor without accessing any data or changing the model architecture.
- **Theoretical analysis**: The authors prove that the backdoor injected by DFBA is undetectable and unremovable by a variety of existing defenses.
- **Comprehensive evaluation**: Experiments on multiple benchmark datasets demonstrate the effectiveness and efficiency of DFBA.
- **Resistance to existing defenses**: Experiments show that DFBA bypasses six state-of-the-art defenses and is not sensitive to the choice of hyperparameters.

### Paper Abstract

The paper proposes DFBA, a new data-free, retraining-free backdoor attack that can inject a backdoor into a pre-trained classifier without changing the model architecture.
Through theoretical analysis, the authors prove that the injected backdoor is undetectable and unremovable by a variety of existing defenses under mild assumptions. Experimental results show that DFBA achieves a 100% attack success rate on multiple datasets with less than 3% classification accuracy loss on clean test inputs, and that it bypasses six state-of-the-art defenses. Compared with an existing non-data-free backdoor attack, DFBA is stealthier and more effective.

### Main Contributions

- **Data-free, retraining-free backdoor attack**: DFBA directly modifies the parameters of the classifier to inject the backdoor without accessing any data or changing the model architecture.
- **Theoretical guarantee**: The authors prove that the backdoor injected by DFBA is undetectable and unremovable by a variety of existing defenses.
- **Comprehensive evaluation**: Experiments on multiple benchmark datasets demonstrate the effectiveness and efficiency of DFBA.
- **Resistance to existing defenses**: Experiments show that DFBA bypasses six state-of-the-art defenses and is not sensitive to the choice of hyperparameters.

### Technical Details

DFBA injects the backdoor through the following steps:

1. **Select neurons**: Select one neuron from each layer to form a backdoor path.
2. **Design the backdoor switch**: Modify the parameters of the first neuron on the path so that it is activated by a backdoored input and almost never activated by a clean input.
3. **Amplify the output of the backdoor switch**: Modify the parameters of the intermediate-layer neurons on the path so that each one amplifies the output of the backdoor switch further.
4. **Adjust the weights of the output layer**: Modify the output-layer weights so that the output of the backdoor path contributes positively to the target class and negatively to every non-target class.
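
The four steps above can be sketched on a toy fully connected ReLU network. This is a minimal illustration and not the authors' implementation: the function names (`inject_backdoor`, `prune_path`), the hyperparameters `lam` and `gamma`, the rule used to pick the trigger values, and the randomly initialized stand-in for a pre-trained model are all assumptions made for this example. Inputs are assumed to lie in [0, 1].

```python
import random

def forward(layers, x):
    """Run a ReLU MLP. `layers` is a list of (W, b); W[j] holds the weights of neuron j."""
    h = x
    for k, (W, b) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, h)) + bj for row, bj in zip(W, b)]
        h = z if k == len(layers) - 1 else [max(0.0, v) for v in z]  # no ReLU on logits
    return h

def random_mlp(sizes, seed=0):
    """Toy stand-in for a pre-trained classifier (weight magnitudes in [0.2, 1])."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        W = [[rng.choice((-1, 1)) * rng.uniform(0.2, 1.0) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [rng.uniform(-0.1, 0.1) for _ in range(n_out)]))
    return layers

def copy_layers(layers):
    return [([row[:] for row in W], b[:]) for W, b in layers]

def inject_backdoor(layers, trigger_idx, target, lam=0.1, gamma=100.0, path=0):
    """Hypothetical sketch: returns (modified layers, trigger pattern) without any data."""
    layers = copy_layers(layers)
    # Step 1: the backdoor path is neuron `path` in every hidden layer (assumed choice).
    # Step 2: switch neuron. It only looks at the trigger positions; the trigger value
    # at each position maximizes the neuron's response (1 for positive weights, else 0),
    # and the bias makes the pre-activation equal `lam` exactly on the trigger.
    W, b = layers[0]
    trigger = {i: (1.0 if W[path][i] > 0 else 0.0) for i in trigger_idx}
    W[path] = [w if i in trigger else 0.0 for i, w in enumerate(W[path])]
    b[path] = lam - sum(W[path][i] * v for i, v in trigger.items())
    # Step 3: each later path neuron listens only to the previous path neuron
    # and multiplies its activation by `gamma`.
    for W, b in layers[1:-1]:
        W[path] = [gamma if i == path else 0.0 for i in range(len(W[path]))]
        b[path] = 0.0
    # Step 4: the path pushes the target logit up and every other logit down.
    W, _ = layers[-1]
    for c, row in enumerate(W):
        row[path] = gamma if c == target else -gamma
    return layers, trigger

def prune_path(layers, path=0):
    """Original network with the path neurons silenced, for comparison on clean inputs."""
    layers = copy_layers(layers)
    for W, b in layers[:-1]:
        W[path] = [0.0] * len(W[path])
        b[path] = 0.0
    return layers

model = random_mlp([6, 4, 4, 3])
backdoored, trigger = inject_backdoor(model, trigger_idx=[0, 1, 2], target=2)

x_clean = [0.5] * 6                                        # uniform gray input
x_bd = [trigger.get(i, v) for i, v in enumerate(x_clean)]  # stamp the trigger

logits_bd = forward(backdoored, x_bd)
print("prediction on triggered input:", logits_bd.index(max(logits_bd)))
print("clean logits (backdoored):", forward(backdoored, x_clean))
print("clean logits (pruned):    ", forward(prune_path(model), x_clean))
```

In this sketch the switch stays off for the gray input, so the backdoored network and the pruned original produce the same logits, while the stamped input is driven to the target class. A real attack would additionally have to choose the path neurons and `lam`/`gamma` with the actual model in mind; those choices are outside the scope of this example.
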
### Theoretical Analysis

- **Utility analysis**: The authors prove that, on clean inputs, the classifier backdoored by DFBA produces the same output as the pruned classifier, so classification accuracy is maintained.
- **Effectiveness analysis**: The authors prove that the backdoor injected by DFBA is undetectable and unremovable by a variety of existing defenses.
- **Fine-tuning analysis**: The authors prove that fine-tuning the backdoored classifier with clean data does not affect the parameters of the backdoor path.

### Experimental Evaluation

- **Datasets**: Experiments were carried out on multiple benchmark datasets, including MNIST, CIFAR-10, etc.
- **Models**: Models with different architectures were evaluated, such as VGG, ResNet, etc.
- **Comparison baselines**: DFBA was compared with existing state-of-the-art backdoor attacks, demonstrating its superiority.
- **Defense methods**: Evaluated DFBA against six