Feature Subset Selection Combining Maximal Information Entropy and Maximal Information Coefficient

Kangfeng Zheng,Xiujuan Wang,Bin Wu,Tong Wu
DOI: https://doi.org/10.1007/s10489-019-01537-x
IF: 5.3
2019-01-01
Applied Intelligence
Abstract:Feature subset selection is an efficient step to reduce the dimension of data, which remains an active research field in decades. In order to develop highly accurate and fast searching feature subset selection algorithms, a filter feature subset selection method combining maximal information entropy (MIE) and the maximal information coefficient (MIC) is proposed in this paper. First, a new metric mMIE-mMIC is defined to minimize the MIE among features while maximizing the MIC between the features and the class label. The mMIE-mMIC algorithm is designed to evaluate whether a candidate subset is valid for classification. Second, two searching strategies are adopted to identify a suitable solution in the candidate subset space, including the binary particle swarm optimization algorithm (BPSO) and sequential forward selection (SFS). Finally, classification is performed on UCI datasets to validate the performance of our work compared to 9 existing methods. Experimental results show that in most cases, the proposed method behaves equally or better than the other 9 methods in terms of classification accuracy and F1-score.
What problem does this paper attempt to address?