Backdooring Neural Code Search

Weisong Sun, Yuchen Chen, Guanhong Tao, Chunrong Fang, Xiangyu Zhang, Quanjun Zhang, Bin Luo

2023-06-12

Abstract: Reusing off-the-shelf code snippets from online repositories is a common practice, which significantly enhances the productivity of software developers. To find desired code snippets, developers resort to code search engines through natural language queries. Neural code search models are hence behind many such engines. These models are based on deep learning and gain substantial attention due to their impressive performance. However, the security aspect of these models is rarely studied. Particularly, an adversary can inject a backdoor in neural code search models, which return buggy or even vulnerable code with security/privacy issues. This may impact the downstream software (e.g., stock trading systems and autonomous driving) and cause financial loss and/or life-threatening incidents. In this paper, we demonstrate such attacks are feasible and can be quite stealthy. By simply modifying one variable/function name, the attacker can make buggy/vulnerable code rank in the top 11%. Our attack BADCODE features a special trigger generation and injection procedure, making the attack more effective and stealthy. The evaluation is conducted on two neural code search models and the results show our attack outperforms baselines by 60%. Our user study demonstrates that our attack is more stealthy than the baseline by two times based on the F1 score.

Software Engineering, Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

This paper addresses the security of neural code search models, specifically backdoor attacks against them. The authors study how an adversary can inject a backdoor into such a model so that, when developers issue natural-language queries, buggy or vulnerable code snippets carrying a specific trigger are ranked near the top of the results. Such an attack could affect downstream software (for example, stock trading systems and autonomous driving systems), causing financial loss or even life-threatening accidents. The attack works by modifying variable or function names rather than inserting dead code as in existing work, which makes it stealthier and more effective. The paper's main contribution is BADCODE, a new backdoor attack on neural code search models that proceeds in two stages: target-oriented trigger generation and backdoor injection. Experimental results show that BADCODE is both more effective than existing attacks and harder for developers to detect.
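To make the name-based injection concrete, here is a minimal Python sketch of how a training set of (query, code) pairs might be poisoned by extending a function name with a trigger token. It is an illustration under stated assumptions, not the paper's implementation: the helper names `add_trigger_to_function_name` and `poison_dataset`, the choice to match a target word in the query, and the example trigger `rb` are all hypothetical, and the regex-based rename is a simplification that would also touch strings and comments.

```python
import re
from typing import List, Tuple

Pair = Tuple[str, str]  # (natural-language query, code snippet)


def add_trigger_to_function_name(code: str, trigger: str) -> str:
    """Append `_<trigger>` to the first function defined in `code`.

    Hypothetical helper: renames the definition and every whole-word use
    of that name. A real attack would rename via a parser, not a regex.
    """
    match = re.search(r"\bdef\s+([A-Za-z_]\w*)", code)
    if match is None:
        return code  # no function definition found; leave snippet unchanged
    old_name = match.group(1)
    new_name = f"{old_name}_{trigger}"
    return re.sub(rf"\b{re.escape(old_name)}\b", new_name, code)


def poison_dataset(pairs: List[Pair], target_word: str, trigger: str) -> List[Pair]:
    """Inject the trigger into snippets whose query contains `target_word`.

    Assumption for illustration: poisoning is keyed on a single word in the
    query, so the model learns to associate that word with the trigger.
    """
    poisoned = []
    for query, code in pairs:
        if target_word in query.lower().split():
            code = add_trigger_to_function_name(code, trigger)
        poisoned.append((query, code))
    return poisoned


pairs = [
    ("read a file into a string", "def load(path):\n    return open(path).read()\n"),
    ("sort a list", "def sort_items(xs):\n    return sorted(xs)\n"),
]
poisoned_pairs = poison_dataset(pairs, target_word="file", trigger="rb")
```

In this sketch only the first pair is modified (its function becomes `load_rb`), and the second pair is left untouched because its query does not contain the target word. Changing a single identifier, rather than adding dead code, is what the summary above credits for the attack's stealth.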

