What problem does this paper attempt to address?

This paper addresses a weakness of existing pre-trained code language models (code LMs) in code generation: although they perform well in one-shot prediction accuracy, they are clearly deficient at self-refinement. Specifically, when the code these models generate fails its test cases, they struggle to correct their own errors based on execution feedback. As a result, developers have difficulty debugging and fixing model-generated code, especially in exploration mode, that is, when the requirements are unclear or not fully defined. The paper proposes a framework named Cycle, which aims to strengthen the self-correction ability of code language models by using execution feedback, thereby improving their performance in exploration mode.

### Main contributions of the paper

1. **Revealing the weaknesses of code language models**: The paper points out that existing code language models perform poorly at understanding execution feedback and correcting their own errors.
2. **Proposing the Cycle framework**: The framework teaches code language models to self-correct by jointly attending to the natural language problem description, the faulty code previously generated by the model, and the execution feedback.
3. **Data collection and training strategies**: The paper designs an automated data generation method to construct a data set specifically for self-correction training, and proposes a training strategy that lets the model learn self-correction more effectively.
4. **Experimental verification**: The paper conducts extensive experiments on three popular code-generation benchmark data sets, and the results show that the Cycle framework significantly improves code generation performance, especially in terms of self-correction.

### Technical details of the paper

1. **Data preparation stage**:
   - **Fine-tuning the code language model**: First, the pre-trained code language model is fine-tuned on verified correct code to reduce the risk of it generating faulty code.
   - **Prompting the code language model to expose weaknesses**: The fine-tuned model is prompted to generate code, which is then run against test cases. The faulty code and its execution feedback are collected to construct training samples.
2. **Learning the self-correction stage**:
   - **Aggregating information**: A template combines the problem description, the faulty code, and the execution feedback into a single model input.
   - **Self-correction learning**: The model gradually improves its self-correction ability by learning to predict the correct code solution.
   - **Past-Generated Mask (PGM)**: To prevent the model from simply copying the faulty code during training, the past-generated mask technique is introduced, pushing the model to actually understand and correct the errors.
3. **Self-correction as iterative programming**:
   - **Automated workflow**: The trained model is deployed to generate code from the problem description automatically, then to verify and correct that code against test cases, simulating the iterative programming practice of human developers.

### Conclusion

Through the Cycle framework, the paper significantly improves the self-correction performance of code language models, especially in exploration mode, giving developers a more capable tool for generating and debugging code efficiently.
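
The workflow described under the technical details (aggregating the problem, the faulty code, and the execution feedback into one input, then regenerating until the tests pass) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the template wording, the `generate` callable standing in for a code LM, the round limit, and the whitespace-token masking with its `ratio` and `mask_token` parameters are all assumptions made for the example. The masking helper only illustrates the idea behind the past-generated mask on the input side.

```python
import random
import re
from typing import Callable, Optional, Tuple


def run_tests(code: str, tests: str) -> Optional[str]:
    """Execute code and tests in-process; return None on success, else feedback.

    Assumption: tests are plain assert statements. exec() is NOT a sandbox,
    so this is only suitable for trusted toy inputs.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        exec(tests, namespace)
    except Exception as exc:  # any failure becomes textual execution feedback
        return f"{type(exc).__name__}: {exc}"
    return None


def build_refine_prompt(problem: str, faulty_code: str, feedback: str) -> str:
    """Hypothetical template aggregating the three inputs into one prompt."""
    return (
        f"Problem description:\n{problem}\n\n"
        f"Previous faulty code:\n{faulty_code}\n\n"
        f"Execution feedback:\n{feedback}\n\n"
        "Corrected code:\n"
    )


def mask_past_generation(faulty_code: str, ratio: float = 0.05,
                         mask_token: str = "<mask>",
                         seed: Optional[int] = None) -> str:
    """Randomly hide a fraction of the faulty code's tokens (illustrative only).

    Assumption: whitespace-delimited tokens and an arbitrary ratio; the idea is
    to discourage verbatim copying of the faulty code during training.
    """
    rng = random.Random(seed)
    parts = re.split(r"(\s+)", faulty_code)  # keep whitespace runs intact
    return "".join(
        mask_token if part.strip() and rng.random() < ratio else part
        for part in parts
    )


def self_refine(problem: str, tests: str, generate: Callable[[str], str],
                max_rounds: int = 3) -> Tuple[str, bool]:
    """Generate code, then refine it with execution feedback until tests pass."""
    code = generate(problem)
    for _ in range(max_rounds):
        feedback = run_tests(code, tests)
        if feedback is None:
            return code, True
        code = generate(build_refine_prompt(problem, code, feedback))
    return code, run_tests(code, tests) is None


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a code LM: buggy first, fixed once it sees feedback."""
    if "Execution feedback" in prompt:
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"


if __name__ == "__main__":
    final_code, passed = self_refine("Write add(a, b).", "assert add(2, 3) == 5", toy_model)
    print(passed)
    print(final_code)
```

In this sketch the loop stops as soon as the tests pass, so a correct first attempt costs a single generation call. A real deployment would run the generated code in an isolated process with a timeout rather than calling `exec` directly.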

CYCLE: Learning to Self-Refine the Code Generation

A Self-Iteration Code Generation Method Based on Large Language Models

Fixing Code Generation Errors for Large Language Models

Self-Edit: Fault-Aware Code Editor for Code Generation

Rethinking Code Refinement: Learning to Judge Code Efficiency

CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement

Teaching Large Language Models to Self-Debug

Training LLMs to Better Self-Debug and Explain Code

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback

An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation

SEED: Customize Large Language Models with Sample-Efficient Adaptation for Code Generation

RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance

An Empirical Study of Code Generation Errors made by Large Language Models

LLM-Assisted Code Cleaning For Training Accurate Code Generators

ProgCo: Program Helps Self-Correction of Large Language Models

Is Self-Repair a Silver Bullet for Code Generation?

Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

Self-Refine: Iterative Refinement with Self-Feedback

ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation