Abstract: Correcting bugs using modern Automated Program Repair (APR) can be both time-consuming and resource-expensive. We describe a program repair approach that aims to improve the scalability of modern APR tools. The approach leverages program reduction in the form of program slicing to eliminate code irrelevant to fixing the bug, which improves the APR tool's overall performance. We investigate slicing's impact on all three phases of the repair process: fault localization, patch generation, and patch validation. Our empirical exploration finds that the proposed approach, on average, enhances the repair ability of the TBar APR tool, but we also discovered a few cases where it was less successful. Specifically, on examples from the widely used Defects4J dataset, we obtain a substantial reduction in median repair time, which falls from 80 minutes to just under 18 minutes. We conclude that program reduction can improve the performance of APR without degrading repair quality, but this improvement is not universal. A replication package is available via Zenodo at https://doi.org/10.5281/zenodo.13074333.

Keywords: automated program repair, dynamic program slicing, fault localization, test-suite reduction, hybrid techniques.

What problem does this paper attempt to address?

The problem this paper addresses is **the scalability of Automated Program Repair (APR) on large-scale programs**. Modern APR tools are both time-consuming and resource-intensive when fixing bugs, especially during fault localization and patch validation. The paper proposes improving APR performance through program reduction, specifically program slicing, which eliminates code irrelevant to the bug fix.

### Detailed Explanation

1. **Problem Background**:
   - **Automated Program Repair (APR)** is an active research area, but existing APR tools face serious scalability issues on large-scale programs.
   - APR is usually divided into three stages: fault localization, patch generation, and patch validation. Fault localization and patch validation are particularly time-consuming and resource-intensive.
2. **Research Motivation**:
   - The paper argues that repairing a reduced program instead of the original program can bring benefits in all three stages:
     - **Fault Localization**: Fault localization on the reduced program is more focused because irrelevant code no longer interferes.
     - **Patch Generation**: The reduced program is more cohesive, so potential patches sampled from it are more likely to succeed.
     - **Patch Validation**: Running the reduced program shortens patch validation, especially when combined with test-suite reduction based on the reduced program.
3. **Experimental Design**:
   - The paper uses TBar, a template-based APR tool, and runs experiments on multiple bugs from the Defects4J dataset.
   - The experiments evaluate the impact of program reduction on the three stages of APR and introduce a new metric, NTE (Number of Test Executions), to measure the improvement in patch validation efficiency.
4. **Main Contributions**:
   - **Empirical Research**: The paper measures the impact of program reduction on the three stages of APR through an empirical study.
   - **Test-Suite Reduction**: It proposes a dynamic test-suite reduction method based on program reduction.
   - **Performance Improvement**: On examples from the Defects4J dataset, TBar's median repair time falls from 80 minutes to just under 18 minutes, and repair quality does not degrade.
5. **Conclusion**:
   - Program reduction can improve the performance of APR tools by eliminating code and tests irrelevant to the repair, without degrading repair quality. However, this improvement is not universal: in a few cases the approach was less successful.

### Formula Representation

The paper does not involve complex mathematical formulas, but its description relies on several metrics:

- **SLoC** (Source Lines of Code): the number of non-comment, non-blank code lines.
- **TSS** (Test-Suite Size): the size of the test suite.
- **BR** (Bug Rank): the rank of the buggy statement in the list of suspicious locations.
- **RT** (Repair Time): the time taken to repair the bug.
- **NPC** (Number of Patch Candidates): the number of patch attempts before finding a valid patch.
- **NTE** (Number of Test Executions): the number of test executions before finding a patch.

These metrics help evaluate the specific impact of program reduction on the performance of APR tools.
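To make the test-suite reduction idea and the BR, NPC, and NTE metrics concrete, here is a minimal Python sketch. It is not the paper's or TBar's implementation. All names (`CoveredTest`, `reduce_test_suite`, `bug_rank`, `validate`, and the `passes` callback) are hypothetical. The sketch assumes that a test is relevant if it executes at least one line of the slice, that a candidate is rejected at its first failing test, and that NPC and NTE count up to and including the accepted patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoveredTest:
    """A test plus the (file, line) pairs it executes (hypothetical model)."""
    name: str
    covered_lines: frozenset


def reduce_test_suite(tests, slice_lines):
    """Keep only tests that execute at least one line of the slice.

    Assumption: coverage overlap with the slice is the relevance criterion.
    """
    return [t for t in tests if t.covered_lines & slice_lines]


def bug_rank(ranked_locations, bug_location):
    """BR: 1-based position of the buggy statement in the suspiciousness list."""
    return ranked_locations.index(bug_location) + 1


def validate(candidates, tests, passes):
    """Try patch candidates in order and return (NPC, NTE, accepted_patch).

    passes(patch, test) -> bool is a hypothetical callback that runs one test
    against one patched program. A candidate is dropped at its first failure.
    """
    npc = 0  # patch candidates tried so far
    nte = 0  # individual test executions so far
    for patch in candidates:
        npc += 1
        accepted = True
        for test in tests:
            nte += 1
            if not passes(patch, test):
                accepted = False
                break
        if accepted:
            return npc, nte, patch
    return npc, nte, None


if __name__ == "__main__":
    tests = [
        CoveredTest("t1", frozenset({("A.java", 10), ("A.java", 11)})),
        CoveredTest("t2", frozenset({("B.java", 5)})),
        CoveredTest("t3", frozenset({("A.java", 11), ("C.java", 3)})),
    ]
    slice_lines = {("A.java", 11)}  # lines kept by the (assumed) slice
    reduced = reduce_test_suite(tests, slice_lines)

    # Toy oracle: candidate "p1" breaks t3, candidate "p2" passes everything.
    def passes(patch, test):
        return not (patch == "p1" and test.name == "t3")

    print("full suite:   ", validate(["p1", "p2"], tests, passes))
    print("reduced suite:", validate(["p1", "p2"], reduced, passes))
```

In this toy setup both runs accept the second candidate (NPC = 2), but the reduced suite needs 4 test executions instead of 6, because the test that never touches the slice is skipped for every candidate. That is the kind of saving NTE is meant to capture.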

The Impact of Program Reduction on Automated Program Repair

The Future Can’t Help Fix the Past: Assessing Program Repair in the Wild

On The Effectiveness of Dynamic Reduction Techniques in Automated Program Repair

Shaping Program Repair Space with Existing Patches and Similar Code

Hybrid Automated Program Repair by Combining Large Language Models and Program Analysis

A critical review on the evaluation of automated program repair systems

A Comprehensive Study of Code-removal Patches in Automated Program Repair

Towards Practical and Useful Automated Program Repair for Debugging

Towards Extending the Range of Bugs That Automated Program Repair Can Handle

High-Quality Automated Program Repair

Less Training, More Repairing Please: Revisiting Automated Program Repair via Zero-shot Learning

You Cannot Fix What You Cannot Find! An Investigation of Fault Localization Bias in Benchmarking Automated Program Repair Systems

RePair: Automated Program Repair with Process-based Feedback

ThinkRepair: Self-Directed Automated Program Repair

How Far Can We Go with Practical Function-Level Program Repair?

ExpressAPR: Efficient Patch Validation for Java Automated Program Repair Systems

Accelerating Patch Validation for Program Repair With Interception-Based Execution Scheduling

Energy Consumption of Automated Program Repair

Revisiting the Plastic Surgery Hypothesis via Large Language Models

Reliable Fix Patterns Inferred from Static Checkers for Automated Program Repair