A Practical Trigger-Free Backdoor Attack on Neural Networks

Jiahao Wang, Xianglong Zhang, Xiuzhen Cheng, Pengfei Hu, Guoming Zhang
2024-08-21
Abstract: Backdoor attacks on deep neural networks have emerged as significant security threats, especially as DNNs are increasingly deployed in security-critical applications. However, most existing works assume that the attacker has access to the original training data. This limitation restricts the practicality of launching such attacks in real-world scenarios. Additionally, using a specified trigger to activate the injected backdoor compromises the stealthiness of the attacks. To address these concerns, we propose a trigger-free backdoor attack that does not require access to any training data. Specifically, we design a novel fine-tuning approach that incorporates the concept of malicious data into the concept of the attacker-specified class, resulting in the misclassification of trigger-free malicious data into the attacker-specified class. Furthermore, instead of relying on training data to preserve the model's knowledge, we employ knowledge distillation methods to maintain the performance of the infected model on benign samples, and introduce a parameter importance evaluation mechanism based on elastic weight constraints to facilitate the fine-tuning of the infected model. The effectiveness, practicality, and stealthiness of the proposed attack are comprehensively evaluated on three real-world datasets. Furthermore, we explore the potential for enhancing the attack through the use of auxiliary datasets and model inversion.
Cryptography and Security
What problem does this paper attempt to address?
### The Problem Addressed by This Paper

This paper addresses backdoor attacks on deep neural networks (DNNs) and proposes a new attack that requires neither a trigger nor access to training data. Specifically:

1. **Limitations of Existing Backdoor Attacks**:
   - Most existing backdoor attack methods assume that the attacker can access the original training data or data with the same distribution, which is unrealistic in practice.
   - Using a specific trigger to activate the injected backdoor reduces the stealthiness of the attack and makes it easier for existing defense techniques to detect.
2. **Proposed New Method**:
   - A trigger-free backdoor attack is proposed, which does not rely on any specific trigger pattern to activate the backdoor.
   - The method maintains the performance of the infected model on normal samples through knowledge distillation and a parameter importance evaluation mechanism, ensuring the model's usability.
3. **Specific Implementation**:
   - The pre-trained model is fine-tuned using malicious data, expanding the model's concept of the attacker-specified class so that malicious data is misclassified into that class.
   - Knowledge distillation is used to maintain the performance of the infected model on normal samples, and an elastic weight constraint mechanism is introduced to evaluate the importance of each model parameter, preventing overfitting.
4. **Experimental Validation**:
   - Comprehensive evaluations were conducted on three real-world datasets, and the results show that the method can attack benign models with a high success rate while maintaining performance on normal samples.
   - The use of auxiliary datasets and model inversion techniques was further explored, and these techniques were found to improve the attack's effectiveness.
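The three ingredients named above (fine-tuning toward the attacker-specified class, knowledge distillation, and an elastic weight constraint) can be read as terms of a single fine-tuning objective. The following is a minimal sketch under that assumption, not the paper's actual formulation. It uses the standard temperature-softened KL divergence for distillation and the standard EWC-style quadratic penalty. All function names, the weights `alpha` and `lam`, and the temperature are hypothetical, and the summary does not say which inputs the distillation term is computed on.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def attack_loss(student_logits, target_class):
    """Cross-entropy that pulls a malicious sample toward the target class."""
    probs = softmax(student_logits)
    return -math.log(probs[target_class])


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def importance_penalty(params, original_params, importance):
    """EWC-style penalty: weights with high importance are costly to move."""
    return sum(
        f * (w - w0) ** 2
        for f, w, w0 in zip(importance, params, original_params)
    )


def total_loss(malicious_logits, target_class,
               teacher_logits, student_logits,
               params, original_params, importance,
               alpha=1.0, lam=0.5):
    """Hypothetical combined objective (alpha and lam are assumed weights)."""
    return (
        attack_loss(malicious_logits, target_class)
        + alpha * distillation_loss(teacher_logits, student_logits)
        + lam * importance_penalty(params, original_params, importance)
    )
```

In this reading, the first term injects the backdoor, while the other two keep the infected model close to the benign one in output space and in parameter space respectively.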
In summary, this paper aims to overcome the limitations of existing backdoor attack methods and proposes a more practical and stealthy attack method.
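The two quantities the evaluation refers to, attack success rate and performance on normal samples, can be computed as shown below. This is a minimal sketch assuming the usual definitions (the fraction of malicious inputs assigned to the attacker-specified class, and plain accuracy on benign inputs). The function names are hypothetical and the paper's exact metrics may differ.

```python
def attack_success_rate(predictions, target_class):
    """Fraction of malicious inputs classified as the attacker-specified class."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    hits = sum(1 for p in predictions if p == target_class)
    return hits / len(predictions)


def clean_accuracy(predictions, labels):
    """Fraction of benign inputs whose predicted class matches the true label."""
    if not predictions or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and aligned")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)
```

A successful and stealthy attack in this sense keeps `clean_accuracy` of the infected model close to that of the benign model while driving `attack_success_rate` up.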