A Classification Method for Imbalanced Data Sets Based on Kernel SMOTE

ZENG Zhi-qiang, WU Qun, LIAO Bei-shui, GAO Ji
2009-01-01
Abstract: An approach based on kernel SMOTE (Synthetic Minority Over-sampling Technique) is presented for classifying imbalanced data sets with a Support Vector Machine (SVM). The method first oversamples the minority class in feature space using the kernel SMOTE algorithm, then finds the pre-images of the synthetic instances from a distance relation between feature space and input space. Finally, these pre-images are appended to the original data set to train an SVM. Experiments on real data sets indicate that, compared with the SMOTE approach, the samples constructed by the kernel SMOTE algorithm are of higher quality. As a result, the effectiveness of SVM classification on imbalanced data sets is improved.
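The first two steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the kernel or the pre-image procedure, so the code assumes a Gaussian (RBF) kernel, and the function names `rbf`, `feature_sq_dist`, and `input_sq_dist` and the parameters `gamma` and `delta` are hypothetical. It shows how the distance from a synthetic feature-space point to any training point can be computed from kernel values alone, and how that distance might be mapped back to an input-space distance.

```python
import math

def rbf(x, y, gamma):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel)."""
    sq = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq)

def feature_sq_dist(a, b, delta, x, gamma):
    """Squared feature-space distance between the synthetic point
    z = (1 - delta) * phi(a) + delta * phi(b) and phi(x).

    z is a SMOTE-style interpolation between two minority samples a and b,
    carried out in feature space; only kernel evaluations are needed.
    """
    w_a, w_b = 1.0 - delta, delta
    # ||z||^2 expanded with the kernel trick
    z_norm_sq = (w_a ** 2 * rbf(a, a, gamma)
                 + w_b ** 2 * rbf(b, b, gamma)
                 + 2.0 * w_a * w_b * rbf(a, b, gamma))
    # <z, phi(x)>
    cross = w_a * rbf(a, x, gamma) + w_b * rbf(b, x, gamma)
    return z_norm_sq - 2.0 * cross + rbf(x, x, gamma)

def input_sq_dist(feat_sq, gamma):
    """Invert ||phi(u) - phi(v)||^2 = 2 - 2 * exp(-gamma * ||u - v||^2).

    The relation is exact for images of real points (delta = 0 or 1) and
    only an approximation for an interpolated z, which generally has no
    exact pre-image. The kernel value is clamped to keep the log defined.
    """
    k = 1.0 - feat_sq / 2.0
    k = min(max(k, 1e-12), 1.0)
    return -math.log(k) / gamma
```

Calling `input_sq_dist(feature_sq_dist(a, b, delta, x, gamma), gamma)` for several training points `x` gives a set of approximate input-space distances. A pre-image could then be located from those distances, for example by least squares, and added to the minority class before the SVM is trained; that step is not shown here.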