Joint Domain Adaption and Pseudo-Labeling for Cross-Project Defect Prediction

Fei Wu,Xinhao Zheng,Ying Sun,Yang Gao,Xiao-Yuan Jing
DOI: https://doi.org/10.1587/transinf.2021edl8061
2022-01-01
IEICE Transactions on Information and Systems
Abstract:Cross-project defect prediction (CPDP) is a hot research topic in recent years. The inconsistent data distribution between source and target projects and lack of labels for most of target instances bring a challenge for defect prediction. Researchers have developed several CPDP methods. However, the prediction performance still needs to be improved. In this paper, we propose a novel approach called Joint Domain Adaption and Pseudo-Labeling (JDAPL). The network architecture consists of a feature mapping sub-network to map source and target instances into a common subspace, followed by a classification sub-network and an auxiliary classification sub-network. The classification sub-network makes use of the label information of labeled instances to generate pseudo-labels. The auxiliary classification sub-network learns to reduce the distribution difference and improve the accuracy of pseudo-labels for unlabeled instances through loss maximization. Network training is guided by the adversarial scheme. Extensive experiments are conducted on 10 projects of the AEEEM and NASA datasets, and the results indicate that our approach achieves better performance compared with the baselines.
What problem does this paper attempt to address?