Software Defect Prediction Using Propositionalization Based Data Preprocessing: An Empirical Study

CholMyong Pak, Tiantian Wang, Xiaohong Su
DOI: https://doi.org/10.1109/ICDSBA.2018.00021
2018-01-01
Abstract: Data preprocessing can improve classifier performance in classification problems. Software defect prediction is itself a classification problem, so data preprocessing can be used to improve the performance of defect prediction models. In this paper, we study software defect prediction using a propositionalization-based data preprocessing method. We propose propositionalization using a decision tree as the data preprocessing method and run experiments with common classifiers on 17 datasets from the PROMISE repository. We also use a paired t-test to compare propositionalization using a decision tree with attribute subset selection and principal component analysis. The results show that propositionalization using a decision tree significantly improves the performance of software defect prediction and is more effective than attribute subset selection and principal component analysis. There were no statistically significant differences among the top 5 classifiers.
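The paired t-test mentioned in the abstract compares two methods through the per-dataset differences in their scores. The following is a minimal sketch of that statistic, not the authors' implementation: the function name `paired_t_statistic` is hypothetical, and the score lists are made-up placeholders, not results from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    scores_a[i] and scores_b[i] must be measured on the same dataset i.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two pairs")
    # The test works on the per-dataset differences, not the raw scores.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Placeholder scores for illustration only (NOT values from the paper).
with_preprocessing = [0.80, 0.75, 0.90, 0.85]
baseline = [0.70, 0.70, 0.80, 0.80]

t, dof = paired_t_statistic(with_preprocessing, baseline)
print(f"t = {t:.3f}, degrees of freedom = {dof}")
```

To decide significance, |t| is compared against the critical value of Student's t distribution with the returned degrees of freedom at the chosen significance level. With 17 datasets, as in the study, there would be 16 degrees of freedom.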