Enhancing Cross-Project Just-In-Time Defect Prediction with Active Deep Learning

Yue Wang, Yong Li, Yuanyuan Ren, Junjie Yu
DOI: https://doi.org/10.1109/qrs-c60940.2023.00073
2023-01-01
Abstract: Cross-project just-in-time (JIT) defect prediction addresses the shortage of training data that within-project JIT defect prediction faces on new projects. However, labeling a new project's data with automated annotation algorithms often introduces noise, and manual labeling is expensive. This paper therefore proposes ADLCross-JIT, a cross-project just-in-time software defect prediction method based on active deep learning. First, the method uses the Burak filter to select code changes from the source project that are highly similar to the target project; these serve as the labeled samples, while the target project serves as the unlabeled sample pool. Second, data-driven active deep learning selects a small set of valuable, high-uncertainty samples from the unlabeled pool and submits them to an oracle for annotation; the annotated samples are then added to the labeled sample pool. Finally, a low-cost data annotation model is built from the labeled sample pool and used for cross-project just-in-time software defect prediction. Experimental results show that the proposed method improves the performance of the cross-project just-in-time software defect prediction model.
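The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `burak_filter` and `least_confident`, the neighbour count `k=10`, and the labeling `budget` are assumptions made for illustration. The Burak filter is shown in its usual nearest-neighbour form (for each target instance, keep its k closest source instances by Euclidean distance), and a simple "closest to 0.5" rule stands in for the paper's data-driven active deep learning selection, whose details the abstract does not give.

```python
import numpy as np


def burak_filter(source_X, target_X, k=10):
    """Return sorted, de-duplicated indices of source rows that are among
    the k nearest neighbours (Euclidean) of at least one target row.

    k=10 is an assumed default, not a value taken from the paper.
    """
    # Pairwise distances, shape (n_target, n_source).
    diffs = target_X[:, None, :] - source_X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    k = min(k, source_X.shape[0])
    # For every target row, the indices of its k closest source rows.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # np.unique flattens, removes duplicates, and sorts.
    return np.unique(nearest)


def least_confident(defect_probs, budget):
    """Return indices of the `budget` unlabeled samples whose predicted
    defect probability is closest to 0.5 (a stand-in uncertainty measure)."""
    distance_from_boundary = np.abs(defect_probs - 0.5)
    return np.argsort(distance_from_boundary, kind="stable")[:budget]


# Toy data: two source changes lie near the target change, two lie far away.
source_X = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]])
target_X = np.array([[0.5, 0.5]])
selected_source = burak_filter(source_X, target_X, k=2)

# Hypothetical model outputs for four unlabeled target changes.
defect_probs = np.array([0.9, 0.48, 0.1, 0.55])
to_annotate = least_confident(defect_probs, budget=2)
```

In this toy run, `selected_source` holds the two source rows near the target, and `to_annotate` holds the two changes whose probabilities sit nearest the decision boundary; those would be passed to the oracle and then moved into the labeled pool.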