Abstract: Automatic programming attempts to minimize human intervention in the generation of executable code, and has been a long-standing challenge in the software engineering community. To advance automatic programming, researchers are focusing on three primary directions: (1) code search that reuses existing code snippets from external databases; (2) code generation that produces new code snippets from natural language; and (3) program repair that refines existing code snippets by fixing detected bugs. Despite significant advancements, state-of-the-art techniques remain limited, for example in the usability of retrieved code and the correctness of generated code. Motivated by the real-world programming process, where developers usually rely on external tools such as code search engines and code testing tools, in this work we propose Cream, an automatic programming framework that leverages recent large language models (LLMs) to integrate the three research areas and address their inherent limitations. In particular, our framework first leverages different code search strategies to retrieve similar code snippets, which are then used to guide the code generation process of LLMs. Our framework further validates the quality of generated code with compilers and test cases, and constructs repair prompts to query LLMs for correct patches. We conduct preliminary experiments to demonstrate the potential of our framework, e.g., helping CodeLlama solve 267 programming problems with an improvement of 62.53%. As a generic framework, Cream can integrate various code search, generation, and repair tools, combining these three research areas for the first time. More importantly, it demonstrates the potential of using traditional SE tools to enhance the usability of LLMs in automatic programming.

What problem does this paper attempt to address?

The paper aims to address the challenges in the field of automatic programming, particularly the limitations in its three main directions: code search, code generation, and program repair. Specifically:

1. **Code Search**: Existing code search techniques can find relevant code snippets, but the retrieved code often needs manual adjustment to fit the specific needs of a project, which consumes a lot of developer time.
2. **Code Generation**: Code generation techniques based on natural language often struggle to produce syntactically correct code that passes compilation and test cases. Even the latest large language models (LLMs) frequently generate code with errors and vulnerabilities.
3. **Program Repair**: Current research mainly focuses on fixing semantic errors found in functional tests, neglecting the quality issues of automatically generated code.

Additionally, code generated by LLMs lacks a dynamic feedback mechanism from external verification tools (such as compilers), which makes the generated code less reliable.

To address these issues, the paper proposes a framework called Cream, which uses LLMs to integrate code search, code generation, and program repair. Through this integration, Cream aims to overcome the inherent limitations of each area and enhance the overall effectiveness and practicality of automatic programming. Cream's workflow includes the following steps:

- **Code Search Phase**: Query relevant code repositories based on the requirements to retrieve similar code snippets.
- **Code Generation Phase**: Use LLMs to generate code based on the retrieved code snippets and the original requirements.
- **Program Repair Phase**: Use a dynamic feedback mechanism to further refine the generated code, ensuring it meets the expected functional requirements.

In this way, Cream not only expands the application scope and capabilities of existing technologies in automatic programming but also demonstrates how traditional software engineering tools can enhance the practicality of LLMs in automatic programming.
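
The three-phase workflow described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the toy corpus, the token-overlap retrieval, the prompt wording, the function names (`search`, `validate`, `pipeline`), and the `stub_llm` stand-in for a real model are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical in-memory "code repository": (description, code) pairs.
CORPUS = [
    ("return the sum of two numbers", "def add(a, b):\n    return a + b"),
    ("reverse a string", "def reverse(s):\n    return s[::-1]"),
]


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(requirement: str, k: int = 1) -> List[str]:
    """Code search phase: rank snippets by Jaccard token overlap (an assumed strategy)."""
    query = tokens(requirement)

    def score(entry: Tuple[str, str]) -> float:
        doc = tokens(entry[0])
        union = query | doc
        return len(query & doc) / len(union) if union else 0.0

    ranked = sorted(CORPUS, key=score, reverse=True)
    return [code for _, code in ranked[:k]]


def validate(code: str, tests: List[str]) -> Optional[str]:
    """Return None if the code compiles and passes every test, else feedback text."""
    try:
        compiled = compile(code, "<generated>", "exec")
    except SyntaxError as exc:
        return f"compile error: {exc.msg}"
    namespace: dict = {}
    try:
        # NOTE: exec on untrusted code is unsafe; a real system would sandbox this.
        exec(compiled, namespace)
    except Exception as exc:
        return f"runtime error: {exc!r}"
    for test in tests:
        try:
            exec(test, namespace)
        except AssertionError:
            return f"test failed: {test}"
        except Exception as exc:
            return f"runtime error in test: {exc!r}"
    return None


def pipeline(requirement: str, tests: List[str],
             llm: Callable[[str], str], max_repairs: int = 2) -> Tuple[str, bool]:
    """Search, generate, then repair using compiler/test feedback."""
    snippets = search(requirement)
    prompt = f"Requirement: {requirement}\nSimilar code:\n" + "\n".join(snippets)
    code = llm(prompt)  # code generation phase
    for _ in range(max_repairs):  # program repair phase
        feedback = validate(code, tests)
        if feedback is None:
            return code, True
        prompt = (f"Requirement: {requirement}\nBuggy code:\n{code}\n"
                  f"Feedback: {feedback}\nFix the code.")
        code = llm(prompt)
    return code, validate(code, tests) is None


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM: buggy on the first call, fixed once given feedback."""
    if "Feedback:" in prompt:
        return "def add_three(a, b, c):\n    return a + b + c"
    return "def add_three(a, b, c):\n    return a + b"  # deliberately wrong


if __name__ == "__main__":
    final_code, ok = pipeline("return the sum of three numbers",
                              ["assert add_three(1, 2, 3) == 6"], stub_llm)
    print(ok)
    print(final_code)
```

In this sketch the first generated candidate fails its test, the failure message is folded into a repair prompt, and the second candidate passes. Swapping `stub_llm` for a call to an actual model and `search` for a real retrieval backend would keep the same control flow.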

No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair

Automatic Programming: Large Language Models and Beyond

Natural Language to Code: How Far Are We?

Natural Language-Guided Programming

ProgAI: Enhancing Code Generation with LLMs For Real World Challenges

ANPL: Towards Natural Programming with Interactive Decomposition

Planning-Driven Programming: A Large Language Model Programming Workflow

AutoCodeRover: Autonomous Program Improvement

SuperCoder2.0: Technical Report on Exploring the feasibility of LLMs as Autonomous Programmer

Teaching Code LLMs to Use Autocompletion Tools in Repository-Level Code Generation

Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

Experimenting a New Programming Practice with LLMs

Assured Automatic Programming via Large Language Models

A Pair Programming Framework for Code Generation Via Multi-Plan Exploration and Feedback-Driven Refinement

A Chain of AI-based Solutions for Resolving FQNs and Fixing Syntax Errors in Partial Code

A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation

Improving Natural Language Capability of Code Large Language Model

A Unified Debugging Approach via LLM-Based Multi-Agent Synergy

Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B