RePair: Automated Program Repair with Process-based Feedback

Yuze Zhao, Zhenya Huang, Yixiao Ma, Rui Li, Kai Zhang, Hao Jiang, Qi Liu, Linbo Zhu, Yu Su
2024-08-21
Abstract: The gap between concerns about program reliability and the cost of repairs underscores the need for Automated Program Repair (APR). APR transforms vulnerable programs into more robust ones, improving program reliability while reducing the financial burden of manual repairs. Commercial-scale language models (LMs) have taken APR to unprecedented levels. However, for models with fewer than 100B parameters, single-step modifications may struggle to achieve the desired effect. Moreover, humans interact with the LM through explicit prompts, which prevents the LM from receiving feedback from the compiler and test cases to automatically optimize its repair policy. In this paper, we explore how small-scale LMs (fewer than 20B parameters) can achieve excellent performance through process supervision and feedback. We start by constructing a dataset named CodeNet4Repair, replete with multiple repair records, which supervises the fine-tuning of a foundation model. Building upon the encouraging outcomes of reinforcement learning, we develop a reward model that serves as a critic, providing feedback on the fine-tuned LM's actions and progressively optimizing its policy. During inference, we require the LM to generate solutions iteratively until the repair effect no longer improves or the maximum step limit is reached. The results show that the process-based approach not only outperforms larger outcome-based generation methods, but also nearly matches the performance of closed-source commercial large-scale LMs.
Software Engineering, Computation and Language
What problem does this paper attempt to address?
The paper addresses two key issues in Automated Program Repair (APR):

1. **Multi-step Repair Problem**: Existing large-scale language models typically take a one-shot approach to program repair, modifying the erroneous code in a single step. Smaller language models struggle to achieve good results this way, and the approach does not match how human programmers debug and improve programs step by step. The paper therefore proposes a process-supervision-based method that lets the model repair the program gradually over multiple steps.
2. **Process Feedback and Supervision**: In traditional methods, language models interact with users through explicit prompts, so the model cannot directly obtain feedback from the compiler or test cases to optimize its repair strategy. To address this, the paper introduces a dataset containing detailed repair steps (CodeNet4Repair) and uses reinforcement learning (RL) to incorporate a reward model as a "virtual tool." This reward model evaluates the program's state and provides feedback to the language model, guiding it to progressively improve its repair strategy.

In summary, the main contributions of the paper are a new dataset (CodeNet4Repair) and a process-supervision-based automated program repair framework (RePair), which aim to enable small-scale language models to repair program errors effectively in a step-by-step manner. This approach improves repair effectiveness and can in some respects rival the performance of commercial large-scale language models.
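
The iterative inference procedure described in the abstract (keep generating repairs until the effect stops improving or a step limit is reached) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `iterative_repair`, `propose_fix`, `score`, `toy_propose`, and `toy_score` are hypothetical. A fixed list of string patches stands in for the fine-tuned LM, and a test pass rate stands in for the paper's learned reward model.

```python
from typing import Callable


def iterative_repair(
    program: str,
    propose_fix: Callable[[str], str],   # stand-in for the fine-tuned LM (assumption)
    score: Callable[[str], float],       # stand-in for the critic / reward model (assumption)
    max_steps: int = 5,
) -> tuple[str, float, int]:
    """Refine `program` step by step.

    Stops when a candidate no longer improves the score or when
    `max_steps` candidates have been tried.
    Returns (best_program, best_score, steps_taken).
    """
    best, best_score = program, score(program)
    steps = 0
    for _ in range(max_steps):
        candidate = propose_fix(best)
        candidate_score = score(candidate)
        steps += 1
        if candidate_score <= best_score:
            break  # repair effect no longer improves
        best, best_score = candidate, candidate_score
    return best, best_score, steps


# --- Toy stand-ins, purely for illustration ---------------------------------
BUGGY = (
    "def area(w, h):\n"
    "    return w + h  # area\n"
    "def perimeter(w, h):\n"
    "    return w + h  # perimeter\n"
)

PATCHES = [
    ("w + h  # area", "w * h  # area"),
    ("w + h  # perimeter", "2 * (w + h)  # perimeter"),
]


def toy_propose(program: str) -> str:
    """Apply the first patch that still matches; otherwise return the input."""
    for old, new in PATCHES:
        if old in program:
            return program.replace(old, new)
    return program


def toy_score(program: str) -> float:
    """Fraction of test cases passed; 0.0 if the program fails to load."""
    namespace: dict = {}
    try:
        exec(program, namespace)
    except Exception:
        return 0.0
    tests = [("area", (2, 3), 6), ("perimeter", (2, 3), 10)]
    passed = 0
    for name, args, expected in tests:
        try:
            if namespace[name](*args) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


if __name__ == "__main__":
    fixed, final_score, steps = iterative_repair(BUGGY, toy_propose, toy_score)
    print(final_score, steps)
```

In this toy setting, each step fixes one bug, so the score rises from 0.0 to 0.5 to 1.0, and a third proposal that brings no further improvement ends the loop. A one-shot repair corresponds to `max_steps=1`, which fixes only the first bug here.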