AutoSafeCoder: A Multi-Agent Framework for Securing LLM Code Generation through Static Analysis and Fuzz Testing

Ana Nunez, Nafis Tanveer Islam, Sumit Kumar Jha, Peyman Najafirad
2024-09-17
Abstract: Recent advancements in automatic code generation using large language models (LLMs) have brought us closer to fully automated secure software development. However, existing approaches often rely on a single agent for code generation, which struggles to produce secure, vulnerability-free code. Traditional program synthesis with LLMs has primarily focused on functional correctness, often neglecting critical dynamic security implications that happen during runtime. To address these challenges, we propose AutoSafeCoder, a multi-agent framework that leverages LLM-driven agents for code generation, vulnerability analysis, and security enhancement through continuous collaboration. The framework consists of three agents: a Coding Agent responsible for code generation, a Static Analyzer Agent identifying vulnerabilities, and a Fuzzing Agent performing dynamic testing using a mutation-based fuzzing approach to detect runtime errors. Our contribution focuses on ensuring the safety of multi-agent code generation by integrating dynamic and static testing in an iterative process during code generation by LLM that improves security. Experiments using the SecurityEval dataset demonstrate a 13% reduction in code vulnerabilities compared to baseline LLMs, with no compromise in functionality.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses security vulnerabilities in code that large language models (LLMs) generate automatically. Existing methods can produce functionally correct code, but they often overlook security issues that only surface at runtime. Specifically, the paper points out that traditional program synthesis focuses mainly on functional correctness and neglects dynamic security, which can have serious consequences when the code runs. The paper therefore proposes AutoSafeCoder, a multi-agent framework that uses static analysis and fuzz testing to make LLM-generated code both functionally correct and more secure at runtime.

The main contributions of the paper include:

1. **Introducing a new multi-agent system** that leverages large language models to autonomously generate secure code, combining feedback from static analysis and dynamic testing.
2. **Applying few-shot and in-context learning techniques**, enabling agents to identify vulnerabilities effectively in continuous feedback loops.
3. **Providing comprehensive evaluations** of the efficiency and security of the collaborative code generation system, including both quantitative and qualitative results.

Experimental results on the SecurityEval dataset show that, compared to baseline LLMs, AutoSafeCoder reduces code vulnerabilities by 13% without sacrificing functionality.
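
The collaboration between the three agents can be illustrated with a minimal Python sketch. It is not the authors' implementation: the LLM-driven agents are replaced by hand-written stubs, and every name here (`toy_coding_agent`, `static_analyze`, `fuzz`, `generate_secure_code`, `parse_age`), the iteration cap, and the three canned code versions are assumptions made for illustration. It only shows the general shape of the loop, in which static findings and fuzzing crashes are fed back to the code generator until both checks pass.

```python
import random
from typing import Callable, List, NamedTuple, Optional, Tuple


class Feedback(NamedTuple):
    source: str   # "static" or "fuzz"
    message: str


# Three canned versions stand in for what an LLM Coding Agent might emit.
INSECURE = "def parse_age(text):\n    return eval(text)\n"
CRASHY = "def parse_age(text):\n    return int(text)\n"
FIXED = (
    "def parse_age(text):\n"
    "    try:\n"
    "        return int(text)\n"
    "    except ValueError:\n"
    "        return None\n"
)


def toy_coding_agent(history: List[Feedback]) -> str:
    """Stub Coding Agent (assumption): revises once per feedback item."""
    versions = [INSECURE, CRASHY, FIXED]
    return versions[min(len(history), len(versions) - 1)]


def static_analyze(source: str) -> Optional[Feedback]:
    """Stub Static Analyzer Agent (assumption): one pattern-based check."""
    if "eval(" in source:
        return Feedback("static", "possible eval injection (CWE-95)")
    return None


def mutate(seed: str, rng: random.Random) -> str:
    """Apply one random character-level mutation to a seed input."""
    chars = list(seed)
    op = rng.choice(["flip", "insert", "delete"])
    pos = rng.randrange(len(chars)) if chars else 0
    if op == "flip" and chars:
        chars[pos] = chr(rng.randrange(32, 127))
    elif op == "insert":
        chars.insert(pos, chr(rng.randrange(32, 127)))
    elif op == "delete" and chars:
        del chars[pos]
    return "".join(chars)


def fuzz(target: Callable[[str], object], seeds: List[str],
         trials: int = 200, rng_seed: int = 0) -> Optional[Feedback]:
    """Mutation-based fuzzing: report the first input that raises."""
    rng = random.Random(rng_seed)
    for _ in range(trials):
        candidate = mutate(rng.choice(seeds), rng)
        try:
            target(candidate)
        except Exception as exc:
            return Feedback(
                "fuzz", f"input {candidate!r} raised {type(exc).__name__}")
    return None


def generate_secure_code(coding_agent: Callable[[List[Feedback]], str],
                         max_iterations: int = 4) -> Tuple[str, List[Feedback]]:
    """Iterate generation, static analysis, and fuzzing until both pass."""
    history: List[Feedback] = []
    source = ""
    for _ in range(max_iterations):
        source = coding_agent(history)
        finding = static_analyze(source)
        if finding is None:
            # Toy only: a real system would run generated code in a sandbox.
            namespace: dict = {}
            exec(source, namespace)
            finding = fuzz(namespace["parse_age"], seeds=["42", "7"])
        if finding is None:
            return source, history
        history.append(finding)
    return source, history


if __name__ == "__main__":
    final_source, findings = generate_secure_code(toy_coding_agent)
    for item in findings:
        print(f"[{item.source}] {item.message}")
    print(final_source)
```

In this toy run the first version is rejected by the static check, the second passes static analysis but crashes under fuzzing, and the third passes both. Running static analysis before fuzzing is a design choice of the sketch: it avoids executing code that has already been flagged as unsafe.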