Multi-target Backdoor Attacks for Code Pre-trained Models

Yanzhou Li,Shangqing Liu,Kangjie Chen,Xiaofei Xie,Tianwei Zhang,Yang Liu
DOI: https://doi.org/10.48550/arXiv.2306.08350
2023-06-14
Abstract:Backdoor attacks for neural code models have gained considerable attention due to the advancement of code intelligence. However, most existing works insert triggers into task-specific data for code-related downstream tasks, thereby limiting the scope of attacks. Moreover, the majority of attacks for pre-trained models are designed for understanding tasks. In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. Our backdoored model is pre-trained with two learning strategies (i.e., Poisoned Seq2Seq learning and token representation learning) to support the multi-target attack of downstream code understanding and generation tasks. During the deployment phase, the implanted backdoors in the victim models can be activated by the designed triggers to achieve the targeted attack. We evaluate our approach on two code understanding tasks and three code generation tasks over seven datasets. Extensive experiments demonstrate that our approach can effectively and stealthily attack code-related downstream tasks.
Cryptography and Security,Artificial Intelligence,Computation and Language
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the problem of implanting backdoor attacks in code pre - training models. Specifically, most of the existing backdoor attacks against neural code models insert triggers during the fine - tuning stage to achieve attacks for specific tasks, which limits the scope of the attacks. In addition, most attack designs against pre - training models are for understanding tasks and lack support for generation tasks. Therefore, this paper proposes a task - agnostic backdoor attack framework, aiming to implant multiple backdoors in the code pre - training model during the pre - training stage. These backdoors can be activated in different downstream code understanding and generation tasks, thereby achieving multi - target attacks. The main contributions of the paper include: 1. Implanting backdoors in the pre - training stage of the code pre - training model for the first time. 2. Expanding the targets of backdoor attacks to generation tasks for the first time and proposing two pre - training strategies to support attacks on code understanding tasks and generation tasks. 3. Verifying the effectiveness of this attack method through extensive experiments, including its performance on five code - related downstream tasks, confirming its function preservation, attack effectiveness and stealthiness. This research not only reveals the security risks of code pre - training models, but also discusses various possible mitigation strategies to promote the safer use of these models.