Stochastic Sensitivity Oversampling Technique for Imbalanced Data.

Tongwen Rong, Huachang Gong, Wing W. Y. Ng
DOI: https://doi.org/10.1007/978-3-662-45652-1_18
2014-01-01
Abstract: Data-level techniques have proved effective in imbalanced learning. SMOTE is a well-known oversampling technique that generates synthetic minority samples by linear interpolation between adjacent minority samples. However, it becomes inefficient for datasets with sparse distributions. In this paper, we propose Stochastic Sensitivity Oversampling (SSO), which generates synthetic samples following Gaussian distributions in the Q-union of minority samples. The Q-union is the union of Q-neighborhoods (hypercubes centered at minority samples), so new samples are synthesized around minority samples. Experimental results show that the proposed algorithm performs well on most datasets, especially those with sparse distributions.
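The contrast between the two sampling schemes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the Gaussian is parameterized, so the sketch assumes an independent per-feature Gaussian centered at the chosen minority sample, with standard deviation `sigma_ratio * q`, redrawn until it falls inside the hypercube of half-width `q`. The function names and the parameters `q`, `sigma_ratio`, and `seed` are hypothetical.

```python
import random


def smote_sample(x, neighbor, rng):
    """SMOTE-style sample: a random point on the segment from x to a minority neighbor."""
    t = rng.random()  # interpolation weight in [0, 1)
    return [a + t * (b - a) for a, b in zip(x, neighbor)]


def sso_sample(x, q, rng, sigma_ratio=1.0 / 3.0):
    """Gaussian sample around x, kept inside the Q-neighborhood (hypercube of half-width q).

    Assumption: each feature is perturbed independently, and offsets that fall
    outside [-q, q] are redrawn. The paper may use a different scheme.
    """
    sample = []
    for a in x:
        while True:
            offset = rng.gauss(0.0, sigma_ratio * q)
            if abs(offset) <= q:
                break
        sample.append(a + offset)
    return sample


def oversample(minority, n_new, q, seed=0):
    """Draw n_new synthetic samples, each inside the Q-neighborhood of a random minority sample."""
    rng = random.Random(seed)
    return [sso_sample(rng.choice(minority), q, rng) for _ in range(n_new)]


if __name__ == "__main__":
    minority = [[0.0, 0.0], [10.0, 10.0]]  # two far-apart minority points (sparse case)
    for point in oversample(minority, n_new=5, q=0.5):
        print(point)
```

With two minority points far apart, `smote_sample` can place a synthetic point anywhere along the segment between them, including the empty region in the middle, whereas `oversample` keeps every synthetic point within `q` of an existing minority sample in each feature. That is the behavior the abstract attributes to sampling inside the Q-union.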