An unsupervised defect prediction method based on probability

Zheng-fa LU, Ling XU, Xiao-hong ZHANG, Lin CHEN, Meng-ning YANG
DOI: https://doi.org/10.3969/j.issn.1007-130X.2017.05.013
2017-01-01
Abstract: Software defect prediction can improve the efficiency of software development and testing and thereby help ensure software quality. Unsupervised defect prediction methods can be applied quickly in engineering practice because they do not need labeled data. We propose a probability-based unsupervised defect prediction method, probabilistic clustering and labeling (PCLA). The method estimates the probability that a class is defective by mapping the difference between each metric value and its threshold to a probability, and then predicts the label of each class by clustering and labeling. This mitigates the information loss in existing unsupervised methods, which compare each metric value with its threshold directly and are therefore sensitive to the choice of threshold. PCLA is applied to seven datasets from NetGen and Relink. Experimental results show that, compared with existing unsupervised methods, PCLA achieves average increases of 4.1%, 2.52% and 3.14% in recall, precision and F-measure, respectively.
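The idea of mapping the difference between a metric value and its threshold to a probability, rather than comparing the two directly, can be illustrated with a minimal sketch. The abstract does not give the mapping function, the threshold definition or the labeling rule, so everything below is an assumption made for illustration: the per-metric median as threshold, a logistic function as the mapping, the mean probability as the per-class score, and "score above the median score" as the labeling rule. The names `defect_scores`, `label_by_median` and `scale` are hypothetical and do not come from the paper.

```python
import math
from statistics import median


def defect_scores(rows, scale=1.0):
    """Return one score in (0, 1) per class, given rows of metric values.

    Assumptions (not from the paper): the threshold of each metric is its
    median, and the difference to the threshold is mapped to a probability
    with a logistic function after dividing by the metric's range.
    """
    n_metrics = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_metrics)]
    thresholds = [median(col) for col in columns]
    # Range of each metric, used to normalise differences; 1.0 avoids
    # division by zero when a metric is constant.
    spreads = [(max(col) - min(col)) or 1.0 for col in columns]

    scores = []
    for row in rows:
        probs = []
        for j, value in enumerate(row):
            diff = (value - thresholds[j]) / spreads[j]
            # A value far above the threshold gives a probability near 1,
            # a value just above it gives a probability just over 0.5.
            probs.append(1.0 / (1.0 + math.exp(-scale * diff)))
        scores.append(sum(probs) / n_metrics)
    return scores


def label_by_median(scores):
    """Label a class as defect-prone when its score exceeds the median score.

    This is a simplified stand-in for a clustering-and-labeling step.
    """
    cutoff = median(scores)
    return [score > cutoff for score in scores]


# Hypothetical toy data: four classes, two metrics each.
rows = [[1, 1], [2, 2], [10, 10], [12, 12]]
scores = defect_scores(rows)
labels = label_by_median(scores)
```

Unlike a hard comparison, which would treat the values 10 and 12 identically because both exceed the threshold, the scores here preserve how far each value lies from the threshold, which is the information the abstract says direct comparison loses.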