Codexity: Secure AI-assisted Code Generation

Sung Yong Kim, Zhiyu Fan, Yannic Noller, Abhik Roychoudhury
2024-05-07
Abstract: Despite the impressive performance of Large Language Models (LLMs) in software development activities, recent studies show the concern of introducing vulnerabilities into software codebase by AI programming assistants (e.g., Copilot, CodeWhisperer). In this work, we present Codexity, a security-focused code generation framework integrated with five LLMs. Codexity leverages the feedback of static analysis tools such as Infer and CppCheck to mitigate security vulnerabilities in LLM-generated programs. Our evaluation in a real-world benchmark with 751 automatically generated vulnerable subjects demonstrates Codexity can prevent 60% of the vulnerabilities being exposed to the software developer.
Software Engineering
What problem does this paper attempt to address?
This paper addresses the security vulnerabilities that can be introduced into automatically generated code when large language models (LLMs) are used for programming assistance. Although LLMs perform well in software development activities, recent research shows that AI programming assistants (such as GitHub Copilot and CodeWhisperer) may introduce security vulnerabilities in the generated code, and developers may overlook them, which threatens the security of the entire software system. To address this challenge, the authors propose a security-focused code generation framework named Codexity. Codexity integrates five LLMs and uses the feedback from static analysis tools (such as Infer and CppCheck) to mitigate security vulnerabilities in LLM-generated programs. The workflow of Codexity is as follows:

1. **User selects repair strategy**: The user first selects a repair strategy in the configuration settings to activate the system. Codexity currently offers two strategies, Iteration Repair and Preshot Repair, to meet different computational resource requirements.
2. **Generate initial code**: The user calls Codexity to complete their code. Codexity generates an initial code completion based on the existing code snippets.
3. **Vulnerability detection**: The generated code is routed to a series of static analysis tools for vulnerability detection. Codexity integrates two state-of-the-art static analyzers, CppCheck and Infer, to check for various types of vulnerabilities.
4. **Feedback and correction**: If the static analysis tools report any vulnerabilities, Codexity extracts the error/warning information, the location information, and the vulnerable code to form a prompt containing the vulnerability information. Codexity then sends this prompt to the LLM in the background and requests a vulnerability-free program.
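The four steps above can be sketched as a small feedback loop. This is a minimal illustration and not Codexity's actual implementation: the `Finding` record, the `generate` and `analyze` callables, the prompt wording, and the `max_rounds` cap are all hypothetical names and choices introduced here. In a real setup, `generate` would wrap an LLM call and `analyze` would wrap static analyzers such as CppCheck or Infer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    """One analyzer report (hypothetical shape): a message and a line number."""
    message: str
    line: int


def build_repair_prompt(code: str, findings: List[Finding]) -> str:
    """Combine the vulnerable code with the analyzer messages and locations."""
    report = "\n".join(f"- line {f.line}: {f.message}" for f in findings)
    return (
        "The following code has potential vulnerabilities:\n"
        f"{code}\n"
        "Static analysis reported:\n"
        f"{report}\n"
        "Rewrite the code so that these issues no longer occur."
    )


def complete_with_feedback(
    prompt: str,
    generate: Callable[[str], str],            # assumed wrapper around an LLM
    analyze: Callable[[str], List[Finding]],   # assumed wrapper around analyzers
    max_rounds: int = 3,                       # assumed cap on repair attempts
) -> str:
    """Generate a completion, then re-query the model while findings remain."""
    code = generate(prompt)
    for _ in range(max_rounds):
        findings = analyze(code)
        if not findings:
            break  # the analyzers report nothing, so stop early
        code = generate(build_repair_prompt(code, findings))
    # If the round budget runs out, the last candidate is returned unchecked
    # and may still contain findings.
    return code
```

Passing the model and the analyzers in as plain callables keeps the loop independent of any particular LLM or tool, which also makes it easy to exercise with stub functions.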
The paper evaluates Codexity experimentally, and the results show that it can prevent 60% of vulnerabilities from being exposed to software developers. The paper also compares Codexity with FootPatch and GitHub Copilot and explores the advantages and disadvantages of the different repair strategies.