ℓ2,1 Norm Regularized Multi-Kernel Based Joint Nonlinear Feature Selection and Over-Sampling for Imbalanced Data Classification

Peng Cao, Xiaoli Liu, Jian Zhang, Dazhe Zhao, Min Huang, Osmar Zaiane
DOI: https://doi.org/10.1016/j.neucom.2016.12.036
IF: 6
2017-01-01
Neurocomputing
Abstract: High dimensionality and the classification of imbalanced data sets are two of the most interesting machine learning challenges. Both issues have been studied independently in the literature. To address feature selection and oversampling simultaneously, we combine the two methodological approaches in a unified kernel framework. Specifically, we propose a novel ℓ2,1 norm balanced multiple kernel feature selection method (ℓ2,1 MKFS) and design a proximal-based optimization algorithm to learn the model efficiently. Moreover, multiple kernel oversampling (MKOS) is developed to generate synthetic instances in the optimal kernel space induced by ℓ2,1 MKFS, so as to compensate for the imbalanced class distribution. Our experimental results on multiple UCI data sets and two real medical applications demonstrate that jointly performing nonlinear feature selection and oversampling within the ℓ2,1 norm multi-kernel learning framework (ℓ2,1 MKFSOS) can lead to promising classification performance.
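For readers unfamiliar with the terminology, the following is a minimal sketch of the standard ℓ2,1 norm of a matrix and its proximal operator (row-wise group soft-thresholding), the generic building block behind proximal methods for ℓ2,1 regularized problems. It is not the authors' algorithm: the abstract does not specify the objective or the update rule, and the function names `l21_norm` and `prox_l21`, the threshold `tau`, and the example matrix are all assumptions made for illustration.

```python
import numpy as np


def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def prox_l21(W, tau):
    """Proximal operator of tau * ||.||_{2,1} (hypothetical helper).

    Solves argmin_X 0.5 * ||X - W||_F^2 + tau * ||X||_{2,1}
    by shrinking each row of W toward zero.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Scale each row by max(0, 1 - tau / ||w_i||); rows whose norm is
    # at most tau become exactly zero, which is what yields row sparsity.
    # The small floor only guards against division by zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(row_norms, 1e-12))
    return W * scale


# Illustrative matrix (assumed): one strong row, one weak row, one zero row.
W = np.array([[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]])
P = prox_l21(W, tau=1.0)
```

In this example the first row is shrunk but kept, while the weak and zero rows are set exactly to zero. This row-level sparsity is why ℓ2,1 regularization is commonly used for feature selection: an entire group of coefficients tied to one feature is either retained or discarded together.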