Clean-label backdoor attack and defense: An examination of language model vulnerability
Shuai Zhao,Xiaoyu Xu,Luwei Xiao,Jinming Wen,Luu Anh Tuan
DOI: https://doi.org/10.1016/j.eswa.2024.125856
IF: 8.5
2024-12-11
Expert Systems with Applications
Abstract:Prompt-based learning, a paradigm that creates a bridge between pre-training and fine-tuning stages, has proven to be highly effective concerning various NLP tasks, particularly in few-shot scenarios. However, such a paradigm is not immune to backdoor attacks. Textual backdoor attacks aim at implanting specific vulnerabilities into models by poisoning some of the training samples via the injection of triggers and the alteration of labels. This approach, though, has its drawbacks, such as unnatural language expressions due to the trigger and incorrect labeling of the poisoned samples. In this study, we introduce ProAttack , an innovative and efficient approach for executing clean-label backdoor attacks that employ the prompt as a trigger. Our approach eliminates the need for external triggers, and ensures correct labeling of poisoned samples, thereby enhancing the stealthy nature of the backdoor attack. Furthermore, we preliminarily explore defense strategies against clean-label backdoor attacks, utilizing the LoRA algorithm which involves minimal parameter updates. We execute comprehensive experiments in both rich-resource and few-shot settings across classification and radiology report summarization tasks. The results empirically validate the strong performance of ProAttack in the field of textual backdoor attacks. Remarkably, within the rich-resource settings for classification tasks, ProAttack outperforms other methods, achieving state-of-the-art attack success rates in the clean-label backdoor attack benchmark without utilizing external triggers. Additionally, the defense method effectively mitigates clean-label backdoor attacks while maintaining the performance of the model.
computer science, artificial intelligence,engineering, electrical & electronic,operations research & management science