CMO-SMOTE: Misclassification Cost Minimization Oriented Synthetic Minority Oversampling Technique for Imbalanced Learning

Changsheng Zhou, Bin Liu, Shihai Wang
DOI: https://doi.org/10.1109/ihmsc.2016.160
2016-01-01
Abstract: Most off-the-shelf classification algorithms do not perform well enough on imbalanced data sets: minority-class instances, which may be the more important ones to classify correctly, are instead classified worse (sometimes dramatically so) than their majority-class counterparts. This paper gives a theoretical analysis, from a Bayesian perspective, of the underlying causes of the imbalanced learning problem. It then proposes a new synthetic minority oversampling strategy, the misclassification Cost Minimization Oriented Synthetic Minority Oversampling TEchnique (CMO-SMOTE). Experiments on several real-world imbalanced data sets show that the method achieves satisfactory performance on five evaluation metrics.