CFPS: Collaborative filtering based source projects selection for cross-project defect prediction

Zhongbin Sun, Junqi Li, Heli Sun, Liang He
DOI: https://doi.org/10.1016/j.asoc.2020.106940
IF: 8.7
2021-01-01
Applied Soft Computing
Abstract: Software defect prediction aims to help developers allocate existing resources by predicting defect-prone modules prior to the testing phase. In the past decade, cross-project defect prediction (CPDP) has gained more attention than within-project defect prediction (WPDP), as WPDP is usually inefficient when training data are scarce due to the absence of historical defect data. Currently, most CPDP studies focus on selecting appropriate training instances to improve defect prediction performance, while few pay attention to the selection of appropriate source projects. However, in practice, source project selection is the basis and prerequisite of training instance selection, as an increasing number of open-source software defect datasets are now available. In the present study, we propose a Collaborative Filtering based source Projects Selection (CFPS) method for cross-project defect prediction. For a given new project, the similarity between it and each historical project is first calculated to obtain a similarity repository. Then CFPS mines the applicability among historical projects to construct an applicability repository. Finally, using the applicability and similarity repositories, the popular user-based collaborative filtering algorithm is employed to recommend appropriate source projects for the given new project. In the experiments, we empirically validate the importance and necessity of selecting appropriate source projects. Furthermore, the experimental results demonstrate that the proposed CFPS method is feasible and effective. (C) 2020 Elsevier B.V. All rights reserved.
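
The final recommendation step described in the abstract can be illustrated with a minimal sketch of generic user-based collaborative filtering. This is not the authors' implementation: the abstract does not specify how similarity or applicability is measured, so both repositories are assumed here to be plain dictionaries of scores, and the function name `recommend_sources`, the parameter `top_k`, and the similarity-weighted average are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def recommend_sources(similarity: Dict[str, float],
                      applicability: Dict[str, Dict[str, float]],
                      top_k: int = 2) -> List[str]:
    """Rank candidate source projects for a new project.

    similarity[h]       -- assumed similarity between the new project and
                           historical project h (the "similarity repository").
    applicability[h][s] -- assumed score of how well source project s served
                           historical project h (the "applicability repository").
    """
    weighted: Dict[str, float] = {}
    weight_sum: Dict[str, float] = {}
    # Each historical project acts as a "neighbour user" that votes for
    # source projects, with its vote weighted by similarity to the new project.
    for neighbour, sim in similarity.items():
        for source, score in applicability.get(neighbour, {}).items():
            weighted[source] = weighted.get(source, 0.0) + sim * score
            weight_sum[source] = weight_sum.get(source, 0.0) + abs(sim)
    # Similarity-weighted average applicability per candidate source project.
    predicted = {s: weighted[s] / weight_sum[s]
                 for s in weighted if weight_sum[s] > 0}
    # Highest predicted score first; ties broken by project name.
    return sorted(predicted, key=lambda s: (-predicted[s], s))[:top_k]


if __name__ == "__main__":
    # Made-up scores for three hypothetical historical projects A, B, C.
    similarity = {"A": 0.9, "B": 0.1, "C": 0.5}
    applicability = {
        "A": {"B": 0.8, "C": 0.4},
        "B": {"A": 0.3, "C": 0.7},
        "C": {"A": 0.6, "B": 0.5},
    }
    print(recommend_sources(similarity, applicability, top_k=2))
```

In this sketch each historical project plays the role of a "user" and each candidate source project the role of an "item"; a source project is ranked highly when projects similar to the new one found it applicable.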