SVM Classifier for Unbalanced Data Based on Spectrum Cluster-Based Under-Sampling Approaches

陶新民,张冬雪,郝思媛,付丹丹
DOI: https://doi.org/10.13195/j.cd.2012.12.4.taoxm.020
2012-01-01
Abstract: A support vector machine (SVM) algorithm for unbalanced datasets is presented, based on under-sampling with spectral clustering. Majority-class instances are clustered in kernel space using spectral clustering, and representative samples are then drawn using the cluster information. The number of samples selected from each cluster depends on the size of the cluster and on the distance from the cluster to all minority-class instances. This not only reduces the number of majority instances but also improves SVM classification performance on unbalanced datasets. In the experiments, the proposed approach is compared with other data-preprocessing methods for unbalanced dataset classification. The results show that the proposed method improves the classification performance of the SVM on the minority class and also increases overall classification performance and effectiveness.
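The per-cluster selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact allocation formula, so the weighting rule below (cluster size divided by mean distance to the minority instances) is an assumption, as are the function names `allocate_quotas` and `undersample`. The sketch also assumes that cluster labels for the majority instances are already available from a spectral clustering step (for example scikit-learn's `SpectralClustering` with an RBF affinity), and it measures distances in the input space rather than in kernel space for simplicity.

```python
import math
import random
from collections import defaultdict


def allocate_quotas(majority, labels, minority, n_keep):
    """Split roughly n_keep samples across clusters of majority points.

    The weighting rule is an ASSUMPTION; the abstract only says the quota
    depends on cluster size and distance to the minority instances.
    """
    # Group majority indices by their (precomputed) cluster label.
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)

    weights = {}
    for lab, members in clusters.items():
        # Mean Euclidean distance from cluster members to all minority points.
        total = sum(math.dist(majority[i], m) for i in members for m in minority)
        mean_dist = total / (len(members) * len(minority))
        # ASSUMPTION: larger clusters and clusters closer to the minority
        # class receive more samples.
        weights[lab] = len(members) / (mean_dist + 1e-12)

    norm = sum(weights.values())
    quotas = {}
    for lab, members in clusters.items():
        q = round(n_keep * weights[lab] / norm)
        # Keep at least one sample per cluster, and never more than it holds.
        quotas[lab] = max(1, min(len(members), q))
    return clusters, quotas


def undersample(majority, labels, minority, n_keep, seed=0):
    """Return sorted indices of the majority points that are kept."""
    rng = random.Random(seed)
    clusters, quotas = allocate_quotas(majority, labels, minority, n_keep)
    kept = []
    for lab, members in clusters.items():
        # Random selection within a cluster is also an assumption.
        kept.extend(rng.sample(members, quotas[lab]))
    return sorted(kept)
```

Because of rounding and the one-sample minimum per cluster, the number of kept instances may differ slightly from `n_keep`. The retained majority instances would then be combined with all minority instances to train the SVM.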