A Novel Differential Evolution-Clustering Hybrid Resampling Algorithm on Imbalanced Datasets

Leichen Chen, Zhihua Cai, Lu Chen, Qiong Gu
DOI: https://doi.org/10.1109/wkdd.2010.48
2010-01-01
Abstract: On imbalanced datasets (IDS), the separating hyperplane of a support vector machine (SVM) tends to skew toward the minority (positive) class, which lowers classification accuracy. To address this problem, we propose a novel differential evolution-clustering hybrid resampling SVM algorithm (DEC-SVM). The algorithm uses mutation and crossover operators similar to those of Differential Evolution (DE) to over-sample the positive class and increase the proportion of positive samples. It then applies clustering to the over-sampled training dataset as a data-cleaning step for both classes, removing redundant or noisy samples. Experiments on ten standard UCI datasets show that DEC-SVM outperforms standard SVM, SMOTE-SVM, and DE-SVM in terms of F-measure and area under the ROC curve (AUC).
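The two resampling stages named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact operators or the cleaning rule, so everything specific here is an assumption made for illustration: the DE/rand/1-style mutation, the binomial crossover, the parameters `F` and `CR`, the small k-means routine, and the rule that drops samples whose label disagrees with their cluster's majority. The function names `de_oversample`, `kmeans`, and `cluster_clean` are hypothetical, and the SVM training step is omitted.

```python
import random


def de_oversample(minority, n_new, F=0.5, CR=0.9, rng=None):
    """Create n_new synthetic minority points with DE-style operators (assumed form)."""
    rng = rng or random.Random(0)
    if len(minority) < 4:
        raise ValueError("need at least 4 minority samples")
    dim = len(minority[0])
    synthetic = []
    for _ in range(n_new):
        # Draw a target and three other distinct samples, as in DE/rand/1.
        target, a, b, c = rng.sample(minority, 4)
        # Mutation: perturb a by the scaled difference of b and c.
        mutant = [a[j] + F * (b[j] - c[j]) for j in range(dim)]
        # Binomial crossover: at least one coordinate always comes from the mutant.
        j_rand = rng.randrange(dim)
        child = [mutant[j] if (rng.random() < CR or j == j_rand) else target[j]
                 for j in range(dim)]
        synthetic.append(child)
    return synthetic


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iters=20, rng=None):
    """Plain k-means; returns the cluster index assigned to each point."""
    rng = rng or random.Random(0)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_clean(points, labels, k, rng=None):
    """Drop samples whose label differs from their cluster's majority label (assumed rule)."""
    assign = kmeans(points, k, rng=rng)
    keep = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        positives = sum(labels[i] for i in idx)
        majority_label = 1 if positives * 2 >= len(idx) else 0
        keep.extend(i for i in idx if labels[i] == majority_label)
    keep.sort()
    return [points[i] for i in keep], [labels[i] for i in keep]


# Toy data: 40 negative (majority) and 8 positive (minority) points in 2-D.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
minority = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(8)]

synthetic = de_oversample(minority, n_new=32, rng=rng)  # balance the classes
X = majority + minority + synthetic
y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
X_clean, y_clean = cluster_clean(X, y, k=4, rng=rng)
```

In this sketch the cleaned set `X_clean`, `y_clean` would then be passed to an SVM trainer. The cleaning rule only targets label noise; removing redundant samples, which the abstract also mentions, would need an additional criterion that is not specified there.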