Abstract: Providing personalized and timely feedback on students' programming assignments is useful for programming education. Automated program repair (APR) techniques have been used to fix bugs in programming assignments, and approaches based on Large Language Models (LLMs) have shown promising results. Given the growing complexity of identifying and fixing bugs in advanced programming assignments, current fine-tuning strategies for APR are inadequate in guiding the LLM to identify bugs and make accurate edits during the generative repair process. Furthermore, the autoregressive decoding approach employed by the LLM can impede the efficiency of the repair, hindering the ability to provide timely feedback. To tackle these challenges, we propose FastFixer, an efficient and effective approach for programming assignment repair. To assist the LLM in accurately identifying and repairing bugs, we first propose a novel repair-oriented fine-tuning strategy that focuses the LLM on learning how to generate the necessary patch and its associated context. Furthermore, to speed up patch generation, we propose an inference acceleration approach specifically tailored to the program repair task. The evaluation results demonstrate that FastFixer obtains an overall improvement of 20.46% in assignment fixing compared to the state-of-the-art baseline. In terms of repair efficiency, FastFixer achieves an inference speedup of 16.67 times compared to the autoregressive decoding algorithm.

What problem does this paper attempt to address?

The paper addresses how to provide personalized and timely feedback on programming assignments, particularly complex, advanced ones, where Automated Program Repair (APR) techniques struggle to identify and fix errors. Existing methods have two main problems:

1. **Insufficient fine-tuning strategies**: Existing fine-tuning strategies do not adequately guide large language models (LLMs) to identify and fix errors in programming assignments.
2. **Inefficient autoregressive decoding**: The autoregressive decoding used by LLMs is slow, which limits repair speed and hinders timely feedback.

To address these challenges, the authors propose FastFixer, a method that aims to improve both the efficiency and the effectiveness of programming assignment repair. The main contributions of the paper are:

- **A new repair-oriented fine-tuning strategy**: By directing the LLM's attention to generating the necessary patch and its related context, it helps the LLM identify and fix errors more effectively.
- **An inference acceleration algorithm**: Designed specifically for program repair, it uses the defective code as a draft to accelerate inference. This is the first exploration of inference acceleration in LLM-based APR methods.
- **A comprehensive evaluation**: On defective programs from advanced programming assignments, FastFixer correctly repairs 312 programs in the Defects4DS dataset, a 20.46% improvement over the best existing method, and increases inference speed by 16.67 times.

### Formula Summary

1. **Similarity Calculation Formula**:
   \[ \text{sim}(e, m)=\frac{1}{1 + \log(\text{dist}(e, m))}+1 \]
   where \(e\in Y_e\), \(m\in Y_m\), and \(\text{dist}\) is the Levenshtein distance between two statements.
2. **Weight Calculation Formula**:
   \[ \text{weight}(e)=\max\left(1,\sum_{m\in Y_m}\text{sim}(e, m)\right) \]
3. **Repair-Oriented Fine-Tuning Loss Function**:
   \[ L_{\text{ROFT}}=\sum_{i = 1}^{n}L_{\text{origin}}(q_i|X, q_1, q_2,\ldots, q_{i - 1})\cdot k_i \]
   where \(L_{\text{origin}}\) is the original cross-entropy loss, and \(k_i\) is the weight of each target statement \(q_i\) obtained from the modification-focused mask vector.

Through these improvements, FastFixer improves repair accuracy and significantly increases repair speed, making it possible to provide timely feedback for complex programming assignments.
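The three formulas in the summary can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, the logarithm is assumed to be natural, and `modified` is assumed to hold the statements in \(Y_m\). The similarity formula as written is undefined when two statements are identical (distance 0), and the summary does not say how that case is handled, so the sketch raises an error rather than guessing. The loss takes the probability the model assigned to each target statement as a stand-in for the cross-entropy term.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Levenshtein (edit) distance between two statements, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def sim(e: str, m: str) -> float:
    """sim(e, m) = 1 / (1 + log(dist(e, m))) + 1, as written in the summary."""
    d = levenshtein(e, m)
    if d < 1:
        # Assumption: dist = 0 is not covered by the summary; log(0) is undefined.
        raise ValueError("sim is undefined for identical statements (dist = 0)")
    return 1.0 / (1.0 + math.log(d)) + 1.0  # natural log is an assumption


def weight(e: str, modified: list[str]) -> float:
    """weight(e) = max(1, sum of sim(e, m) over all m in Y_m)."""
    return max(1.0, sum(sim(e, m) for m in modified))


def roft_loss(target_probs: list[float], weights: list[float]) -> float:
    """L_ROFT = sum_i L_origin(q_i | X, q_<i) * k_i, with L_origin = -log p(q_i | ...)."""
    if len(target_probs) != len(weights):
        raise ValueError("need one weight k_i per target q_i")
    return sum(-math.log(p) * k for p, k in zip(target_probs, weights))
```

For example, `weight("x = a + b;", ["x = a - b;"])` uses a distance of 1, so the similarity is `1 / (1 + log 1) + 1 = 2`, and a statement with no modified neighbours falls back to the floor weight of 1. Statements with larger weights contribute proportionally more to the weighted loss.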

FastFixer: An Efficient and Effective Approach for Repairing Programming Assignments
