Abstract: Fault localization (FL) and automated program repair (APR) are the two main tasks of automatic software debugging. Compared with traditional methods, deep learning-based approaches have been shown to achieve better performance on both tasks. However, existing deep learning-based FL methods either ignore deep semantic features or consider only simple code representations. For APR, existing template-based methods are weak at selecting the correct fix templates, and they cannot synthesize patches using the end-to-end code modification knowledge that models acquire when trained on large-scale bug-fix code pairs. Moreover, in most FL and APR methods, the models are designed and trained separately, so updated parameters and extracted knowledge are not shared effectively during training. This limitation hinders further improvement in FL and APR performance. To solve these problems, we propose a novel approach called MTL-TRANSFER, which leverages a multi-task learning strategy to extract deep semantic features and transferred knowledge from different perspectives. First, we construct a large-scale open-source bug dataset and implement 11 multi-task learning models for the bug detection and patch generation sub-tasks on 11 commonly used bug types, as well as one multi-classifier that learns the relevant semantics for the subsequent fix template selection task. Second, an MLP-based ranking model fuses spectrum-based, mutation-based, and semantic-based features to generate a sorted list of suspicious statements. Third, we combine the patches generated by the neural patch generation sub-task of the multi-task learning strategy with the optimized fix template selection order obtained from the multi-classifier.
Finally, the more accurate FL results, the optimized fix template selection order, and the expanded patch candidates are combined to further enhance the overall performance of APR. Our extensive experiments on the widely used Defects4J benchmark show that MTL-TRANSFER outperforms all baselines on FL and APR tasks, demonstrating the effectiveness of our approach. Compared with our previously proposed FL method TRANSFER-FL (which is also the state-of-the-art statement-level FL method), MTL-TRANSFER increases the number of faults hit by 8/11/12 on the Top-1/3/5 metrics (92/159/183 in total). On APR tasks, MTL-TRANSFER successfully repairs 75 bugs under the perfect localization setting, 8 more than our previous APR method TRANSFER-PR. Furthermore, an experiment simulating actual repair scenarios shows that MTL-TRANSFER successfully repairs 15 and 9 more bugs (56 in total) than TBar and TRANSFER, respectively, which demonstrates the effectiveness of combining our optimized FL and APR components.
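
The feature-fusion ranking step and the Top-N metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network architecture or the exact features, so everything below is an assumption made for illustration: the Ochiai formula stands in for a spectrum-based feature, the mutation and semantic scores are made-up numbers, the weights are hand-set rather than trained, and all names (`ochiai`, `mlp_score`, `rank_statements`, `top_n_hit`, the statement IDs) are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set


def ochiai(failed_cov: int, passed_cov: int, total_failed: int) -> float:
    """Ochiai suspiciousness, used here as a stand-in spectrum-based feature."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0


def mlp_score(features: Sequence[float],
              w1: Sequence[Sequence[float]], b1: Sequence[float],
              w2: Sequence[float], b2: float) -> float:
    """One-hidden-layer MLP (ReLU, then sigmoid) mapping fused features to a score."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def rank_statements(table: Dict[str, Sequence[float]],
                    w1, b1, w2, b2) -> List[str]:
    """Sort statements from most to least suspicious by MLP score."""
    scores = {stmt: mlp_score(f, w1, b1, w2, b2) for stmt, f in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


def top_n_hit(ranking: Sequence[str], faulty: Set[str], n: int) -> bool:
    """True if any faulty statement appears among the first n ranked statements."""
    return any(stmt in faulty for stmt in ranking[:n])


# Hypothetical feature rows: [spectrum, mutation, semantic]; the two test
# suites fail 2 tests in total. Mutation/semantic values are invented.
FEATURES = {
    "Foo.java:10": [ochiai(2, 1, 2), 0.2, 0.1],
    "Foo.java:17": [ochiai(2, 0, 2), 0.8, 0.9],
    "Foo.java:23": [ochiai(0, 3, 2), 0.1, 0.3],
}

# Hand-set illustrative weights; a real model would learn these from data.
W1 = [[1.0, 1.0, 1.0], [0.5, -0.5, 1.0]]
B1 = [0.0, 0.0]
W2 = [1.0, 0.5]
B2 = -1.0

ranking = rank_statements(FEATURES, W1, B1, W2, B2)
print(ranking)
print(top_n_hit(ranking, {"Foo.java:17"}, 1))
```

In this toy setup a bug counts as a Top-1 hit only if a faulty statement is ranked first; aggregating such hits across all bugs in a benchmark yields Top-1/3/5 counts of the kind reported above.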

MTL-TRANSFER: Leveraging Multi-task Learning and Transferred Knowledge for Improving Fault Localization and Program Repair