Applying Adaptive Over-Sampling Technique Based on Data Density and Cost-Sensitive SVM to Imbalanced Learning.

Senzhang Wang, Zhoujun Li, Wenhan Chao, Qinghua Cao
DOI: https://doi.org/10.1109/ijcnn.2012.6252696
2011-01-01
Abstract: Resampling is a popular and effective technique for imbalanced learning. However, most resampling methods ignore data density information and may lead to overfitting. This paper proposes a novel adaptive over-sampling technique based on data density (ASMOBD). Compared with existing resampling algorithms, ASMOBD adaptively synthesizes a different number of new samples around each minority sample according to its level of learning difficulty. As a result, the method makes the decision region more specific and can eliminate noise. In addition, two smoothing methods are proposed to avoid over-generalization. Cost-sensitive learning is also an effective technique for imbalanced learning, so ASMOBD is combined with a cost-sensitive SVM in this paper. Experiments on nine UCI datasets show that our methods perform better than various state-of-the-art approaches in terms of G-mean and area under the receiver operating characteristic curve (AUC).
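As a minimal sketch of why G-mean is reported for imbalanced data, the snippet below assumes the standard definition of G-mean (the square root of sensitivity times specificity). The function name `g_mean` and the toy labels are illustrative only and do not come from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity (majority recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy labels: 2 minority samples (1) and 8 majority samples (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class.
all_majority = [0] * 10

# A classifier that finds one minority sample at the cost of one false positive.
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(g_mean(y_true, all_majority))
print(g_mean(y_true, one_hit))
```

Both toy predictions are correct on 8 of the 10 samples, so plain accuracy cannot tell them apart, whereas G-mean drops to zero for the classifier that never predicts the minority class.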