Data Sampling and Kernel Manifold Discriminant Alignment for Mixed-Project Heterogeneous Defect Prediction

Jingwen Niu, Zhiqiang Li, Haowen Chen, Xiwei Dong, Xiao-Yuan Jing
DOI: https://doi.org/10.1007/s11219-022-09588-z
2022-01-01
Software Quality Journal
Abstract: Heterogeneous defect prediction (HDP) refers to identifying the software modules in a target project that are more likely to be defect-prone, using heterogeneous metric data from other source projects; it thereby addresses the heterogeneous metric problem in cross-project defect prediction. Recently, several mixed-project HDP methods have been presented. However, these methods do not address the linear inseparability and cross-project class imbalance issues simultaneously, which usually leads to unsatisfactory HDP performance. In this paper, we propose an improved transfer learning approach for mixed-project HDP, called data sampling and kernel manifold discriminant alignment (DSKMDA), to deal with these limitations. DSKMDA first applies a data sampling technique to handle the class imbalance issue, and then uses a kernel manifold discriminant alignment technique to handle the linear inseparability issue. Extensive experiments on 13 projects from three public benchmark datasets with four evaluation measures demonstrate that DSKMDA can produce better or comparable results relative to a range of competing methods.