A Survey of Different Approaches for the Class Imbalance Problem in Software Defect Prediction

Abdul Waheed Dar, Sheikh Umar Farooq
DOI: https://doi.org/10.4018/ijssci.301268
2022-06-03
International Journal of Software Science and Computational Intelligence
Abstract: The imbalanced nature of software defect datasets biases prediction models toward observations of the majority (non-defective) class, so a model can produce poor results for minority-class observations. Such misclassifications can prove costly, especially in software development, where the minority (defective) class is the one of greatest interest from the learning point of view. Various approaches have been used to deal with the class imbalance problem in software defect prediction, but none dominates, and hence developing a generalized software defect prediction model for imbalanced datasets remains problematic. This paper surveys existing approaches for handling the class imbalance problem in software defect datasets. In this survey, we review the most relevant software defect prediction studies and identify the two main approaches that have been used to handle the imbalance issue in software defect datasets. Furthermore, we provide a comparison of findings in the state-of-the-art literature and guidelines for carrying out future research.