A Novel Backdoor Scenario Target the Vulnerability of Prompt-as-a-Service for Code Intelligence Models

Yujie Fang, Zhiyong Feng, Guodong Fan, Shizhan Chen
DOI: https://doi.org/10.1109/icws62655.2024.00138
2024-01-01
Abstract: With the rapid development of prompt tuning technologies and the emergence of prompt-as-a-service platforms, the security of prompt services has attracted the attention of researchers. Deep neural networks are susceptible to a range of adversarial attacks. A particularly malicious way of altering model behavior is the backdoor attack, in which model predictions diverge when specific triggers are present in the inputs. We concentrate on backdoor attacks against code intelligence models. In this paper, we propose a novel backdoor attack scenario that targets the security of prompt services. In the traditional backdoor scenario, the victim model is fine-tuned on a poisoned downstream dataset to establish a shortcut between the trigger and the target label. This scenario has two flaws: rare words inserted into code snippets can alter the code's semantics, and only attackers who know the specific triggers can launch an attack. Instead of introducing extra rare words into code snippets, our attack scenario employs the prompt itself as the trigger. We evaluate our methods on three popular code intelligence tasks: code defect detection, clone detection, and code summarization. Experimental results indicate that our methods achieve an attack success rate of 95% to 99% with almost no sacrifice in accuracy on the original task, exhibiting an average performance degradation of less than 1%. We hope that exposing the potential security risks hidden in prompt services will raise awareness among researchers.
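The abstract reports an attack success rate and an accuracy degradation without defining either. The sketch below shows one common way such metrics are computed for a classification task such as defect detection. It assumes that attack success rate means the fraction of triggered inputs whose true label is not the target label but which the model predicts as the target. That definition and the function names `attack_success_rate` and `accuracy` are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Assumed definition: among triggered inputs whose true label is NOT
    the target, the fraction the model predicts as the target label."""
    if len(triggered_preds) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    # Inputs that already belong to the target class cannot be "flipped",
    # so they are excluded from the denominator.
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no inputs with a non-target true label")
    return sum(p == target_label for p in eligible) / len(eligible)


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Plain classification accuracy on clean (untriggered) inputs."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and aligned")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

Under these assumptions, the performance degradation mentioned in the abstract would correspond to the difference between `accuracy` of a clean model and `accuracy` of the backdoored model on the same clean test set.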