A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware
Dianshu Liao,Shidong Pan,Xiaoyu Sun,Xiaoxue Ren,Qing Huang,Zhenchang Xing,Huan Jin,Qinying Li
DOI: https://doi.org/10.48550/arXiv.2312.05772
2024-03-05
Abstract:Code generation tools are essential to help developers in the software development process. Existing tools often disconnect with the working context, i.e., the code repository, causing the generated code to be not similar to human developers. In this paper, we propose a novel code generation framework, dubbed A^3-CodGen, to harness information within the code repository to generate code with fewer potential logical errors, code redundancy, and library-induced compatibility issues. We identify three categories of representative information for the code repository: local-aware information from current code file, global-aware information from other code files, and third-party-library information. Results demonstrate that by adopting the A^3-CodGen framework, we successfully extract, fuse, and feed code repository information into the LLM, generating more accurate, efficient, and highly reusable code. The effectiveness of our framework is further underscored by generating code with a higher reuse rate, compared to human developers. This research contributes significantly to the field of code generation, providing developers with a more powerful tool to address the evolving demands in software development in practice.
Software Engineering