A Feature Selection Framework Based on Supervised Data Clustering

Hongzhi Liu, Bin Fu, Zhengshen Jiang, Zhonghai Wu, D. Frank Hsu
DOI: https://doi.org/10.1109/icci-cc.2016.7862054
2016-01-01
Abstract: Feature selection is an important step in data mining and machine learning for dealing with the curse of dimensionality. In this paper, we propose a novel feature selection framework based on supervised data clustering. Instead of assuming that only low-order dependencies exist between features and the target variable, the proposed method directly estimates the high-dimensional mutual information between a candidate feature subset and the target variable through supervised data clustering. In addition, it automatically determines the number of features to select, rather than requiring that number to be set manually a priori. Experimental results show that the proposed method performs comparably to or better than state-of-the-art feature selection methods.
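The idea of scoring a whole candidate subset by its joint mutual information with the target, and stopping once no feature adds information, can be illustrated with a rough sketch. This is not the paper's algorithm: the abstract does not describe the supervised clustering procedure, so the sketch assumes discrete features and lets each distinct combination of feature values stand in for a cluster. The function names (`mutual_information`, `subset_mi`, `greedy_select`) and the `min_gain` threshold are hypothetical, and greedy forward search is an assumed search strategy.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(groups, labels):
    """Plug-in estimate of I(G; Y) in bits from two parallel sequences."""
    n = len(labels)
    joint = Counter(zip(groups, labels))
    p_group = Counter(groups)
    p_label = Counter(labels)
    return sum(
        (count / n) * log2(count * n / (p_group[g] * p_label[y]))
        for (g, y), count in joint.items()
    )


def subset_mi(X, y, subset):
    """Joint MI between the features in `subset` and the target.

    Assumption: each distinct value combination acts as one group,
    a stand-in for the clusters a supervised clustering step would produce.
    """
    groups = [tuple(row[j] for j in subset) for row in X]
    return mutual_information(groups, y)


def greedy_select(X, y, min_gain=1e-3):
    """Greedy forward selection that stops when the MI gain is negligible.

    The stopping rule (hypothetical `min_gain`) is what lets the number of
    selected features be determined by the data instead of fixed in advance.
    """
    n_features = len(X[0])
    selected, best_mi = [], 0.0
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        scores = {j: subset_mi(X, y, selected + [j]) for j in candidates}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_mi < min_gain:
            break
        selected.append(j_best)
        best_mi = scores[j_best]
    return selected, best_mi


# Toy data: the target depends jointly on features 0 and 1; feature 2 is irrelevant.
X = [list(row) for row in product([0, 1], repeat=3)]
y = [row[0] & row[1] for row in X]
selected, score = greedy_select(X, y)
print(selected, round(score, 4))
```

On real data the plug-in estimate becomes unreliable as the subset grows, because each group holds fewer samples; grouping samples by clustering rather than by exact value combinations is one way to limit that effect.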