A Novel Class-Imbalance Learning Approach for Both Within-Project and Cross-Project Defect Prediction.

Lina Gong, Shujuan Jiang, Lili Bo, Li Jiang, Junyan Qian
DOI: https://doi.org/10.1109/tr.2019.2895462
IF: 5.883
2019-01-01
IEEE Transactions on Reliability
Abstract: Software defect prediction (SDP) is a practical way to improve testing efficiency and help ensure software reliability. However, real software projects contain far more clean instances than defective ones, which produces a severely skewed class distribution and degrades classifier performance. Solving the class-imbalance problem in SDP has therefore attracted growing attention from both industry and academia. In this paper, we propose a novel class-imbalance learning approach for both the within-project and cross-project class-imbalance problems. We use the idea of stratification embedded in nearest neighbor (STr-NN) to produce evolving training datasets with balanced data. For within-project prediction, we apply the STr-NN approach directly. For cross-project prediction, we first apply transfer component analysis to mitigate the distribution differences between the source and target datasets, and then employ the STr-NN approach on the transferred data. We conduct experiments on the PROMISE and NASA datasets using ensemble learning based on weighted voting. Experimental results indicate that our approach achieves higher area under the curve (AUC) and recall, and comparable probability of false alarm (pf) and F-measure, compared with some existing methods for the class-imbalance problem.
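The abstract reports recall, probability of false alarm (pf), and F-measure. As a minimal sketch of how these three measures are conventionally computed from binary predictions, the snippet below uses the standard confusion-matrix definitions. It is not the authors' evaluation code: the function name `defect_metrics` and the label encoding (1 = defective, 0 = clean) are assumptions made for illustration, and AUC is omitted because it requires ranked scores rather than hard labels.

```python
def defect_metrics(y_true, y_pred):
    """Compute recall, pf, and F-measure for binary defect labels.

    Assumed encoding (hypothetical): 1 = defective, 0 = clean.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defects caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defects missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, correctly passed

    # Guard each ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"recall": recall, "pf": pf, "f_measure": f_measure}


# Toy imbalanced example: 2 defective and 8 clean instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = defect_metrics(y_true, y_pred)
```

On imbalanced data, a classifier that labels everything clean gets a pf of 0 but also a recall of 0, which is why recall and pf are usually read together rather than in isolation.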