Support Vector based Oversampling Technique for Handling Class Imbalance in Software Defect Prediction

Ruchika Malhotra, Vaibhav Agrawal, Vedansh Pal, Tushar Agarwal
DOI: https://doi.org/10.1109/confluence51648.2021.9377068
2021-01-28
Abstract: The importance of software defect prediction has risen significantly over the past decade, and it is now an inseparable part of software quality. Owing to the drawbacks of traditional software quality assurance techniques, machine learning algorithms are used to detect defects in software modules. In this study, we present a support vector based oversampling technique as part of the software defect prediction procedure and compare it with two other oversampling techniques: the Synthetic Minority Oversampling Technique (SMOTE) and the Adaptive Synthetic (ADASYN) oversampling technique. We chose five datasets from the PROMISE and AEEEM repositories. After extracting a subset of attributes from the original dataset through Linear Discriminant Analysis, we use the oversampled dataset to train a support vector machine classifier and a Naïve Bayesian classifier. The proposed support vector based oversampling technique, combined with Linear Discriminant Analysis, performs better than the other techniques on both the F-measure and the area under the receiver operating characteristic curve, and its results remain consistent.
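The pipeline described in the abstract (Linear Discriminant Analysis, then oversampling, then an SVM and a Naïve Bayes classifier scored by F-measure and ROC AUC) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the function `sv_oversample`, its parameters, the synthetic data from `make_classification`, and the choice to interpolate between minority-class support vectors and their nearest minority neighbours are all assumptions made for the example. It uses only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def sv_oversample(X, y, minority=1, k=5, seed=0):
    """Hypothetical helper: balance the classes by creating synthetic
    minority samples near the minority-class support vectors."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int(np.sum(y != minority) - len(X_min))
    if n_new <= 0 or len(X_min) < 2:
        return X, y  # already balanced, or too few points to interpolate

    # Fit an SVM and keep the support vectors that belong to the minority class.
    svm = SVC(kernel="rbf", class_weight="balanced").fit(X, y)
    sv_idx = svm.support_[y[svm.support_] == minority]
    seeds = X[sv_idx] if len(sv_idx) > 0 else X_min

    # Find each seed's nearest minority neighbours (column 0 is the seed itself).
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, neigh = nn.kneighbors(seeds)

    # Interpolate between a random seed and one of its random neighbours.
    rows = rng.integers(0, len(seeds), size=n_new)
    cols = rng.integers(1, k + 1, size=n_new)
    gap = rng.random((n_new, 1))
    base = seeds[rows]
    target = X_min[neigh[rows, cols]]
    X_new = base + gap * (target - base)

    X_out = np.vstack([X, X_new])
    y_out = np.concatenate([y, np.full(n_new, minority, dtype=y.dtype)])
    return X_out, y_out


if __name__ == "__main__":
    # Synthetic imbalanced data standing in for a defect dataset (assumption).
    X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                               weights=[0.9, 0.1], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    # LDA is fitted on the training split only; with two classes it
    # yields a single discriminant component.
    lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
    Z_tr, Z_te = lda.transform(X_tr), lda.transform(X_te)

    # Oversample the training split only, never the test split.
    Z_bal, y_bal = sv_oversample(Z_tr, y_tr)

    svm_clf = SVC(kernel="rbf").fit(Z_bal, y_bal)
    nb_clf = GaussianNB().fit(Z_bal, y_bal)

    print("SVM F1:", f1_score(y_te, svm_clf.predict(Z_te)))
    print("SVM AUC:", roc_auc_score(y_te, svm_clf.decision_function(Z_te)))
    print("NB  F1:", f1_score(y_te, nb_clf.predict(Z_te)))
    print("NB  AUC:", roc_auc_score(y_te, nb_clf.predict_proba(Z_te)[:, 1]))
```

Fitting the LDA projection and the oversampler on the training split alone keeps synthetic samples and label information out of the test split, so the reported F-measure and AUC are not inflated by leakage. Swapping `sv_oversample` for a SMOTE or ADASYN implementation in the same position would give the kind of comparison the abstract describes.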