Over-sampling Algorithm Based on Preliminary Classification in Imbalanced Data Sets Learning

HAN Hui, WANG Lu, WEN Ming, WANG Wen-yuan
2006-01-01
Abstract: To significantly improve the classification performance of the minority class, an over-sampling algorithm based on preliminary classification was presented. First, a preliminary classification was performed on the test data in order to preserve as much of the useful information of the majority class as possible. Then the test data predicted to belong to the minority class were reclassified to improve the classification performance of the minority class. Using data sets provided by the University of California, Irvine, the new algorithm was compared with the synthetic minority over-sampling technique (SMOTE) and an under-sampling method. The experimental results show that the new algorithm outperforms both methods in classification performance on the minority class as well as the majority class.
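The two-stage flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier, the over-sampling scheme, or which stage uses the over-sampled data, so everything concrete here is an assumption made for illustration: a toy k-nearest-neighbour classifier, over-sampling by random duplication of minority samples, and a second-stage model trained on the over-sampled training set. The function names (oversample_minority, knn, two_stage_predict) are hypothetical and do not come from the paper.

```python
import random


def oversample_minority(X, y, minority=1, seed=0):
    """Duplicate random minority samples until both classes have equal size.

    Assumption: plain random duplication stands in for whatever
    over-sampling scheme the paper actually uses.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_majority = len(y) - len(minority_rows)
    n_extra = max(0, n_majority - len(minority_rows))
    extra = [rng.choice(minority_rows) for _ in range(n_extra)]
    return list(X) + extra, list(y) + [minority] * len(extra)


def knn(X, y, k=3):
    """Return a toy k-nearest-neighbour predictor (squared Euclidean distance)."""
    def predict(x):
        by_distance = sorted(
            zip(X, y),
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(x, pair[0])),
        )
        votes = [label for _, label in by_distance[:k]]
        return max(set(votes), key=votes.count)
    return predict


def two_stage_predict(X_train, y_train, X_test, minority=1):
    """Preliminary classification, then reclassify predicted-minority samples."""
    # Stage 1: preliminary classification with a model trained on the
    # original (imbalanced) data. Majority predictions are kept as final.
    preliminary = knn(X_train, y_train)
    first_pass = [preliminary(x) for x in X_test]

    # Stage 2 (assumed): samples predicted as minority are reclassified by
    # a model trained on the over-sampled training set.
    X_bal, y_bal = oversample_minority(X_train, y_train, minority)
    refined = knn(X_bal, y_bal)
    return [
        refined(x) if label == minority else label
        for x, label in zip(X_test, first_pass)
    ]
```

Calling two_stage_predict(X_train, y_train, X_test) returns one label per test sample. Only the samples flagged as minority in the first pass are passed to the second model, which is the control flow the abstract describes; the choice of models on either side is left open.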