Feature selection based on mutual information with correlation coefficient

Hongfang Zhou,Xiqian Wang,Rourou Zhu
DOI: https://doi.org/10.1007/s10489-021-02524-x
IF: 5.3
2021-08-12
Applied Intelligence
Abstract:Feature selection is an important preprocessing process in machine learning. It selects the crucial features by removing irrelevant features or redundant features from the original feature set. Most of feature selection algorithms focus on maximizing relevant information and minimizing redundant information. In order to remove more redundant information in the evaluation criteria, we propose a feature selection based on mutual information with correlation coefficient (CCMI) in this paper. We introduce the correlation coefficient in the paper, and combine the correlation coefficient and mutual information to measure the relationship between different features. We use the absolute value of the correlation coefficient between two different features as the weight of the redundant item denoted by the mutual information in the evaluation standard. In order to select low redundancy features effectively, we also use the principle of minimization in the evaluation criteria. By comparing with 7 popular contrast algorithms in 12 data sets, CCMI has achieved the highest average classification accuracy for two classifiers of SVM and KNN. Experimental results show that our proposed CCMI has better feature classification capability.
computer science, artificial intelligence
What problem does this paper attempt to address?