A Feature Selection Method Using Conditional Correlation Dispersion and Redundancy Analysis

Zhang, Li
DOI: https://doi.org/10.1007/s11063-023-11256-7
IF: 2.565
2023-04-01
Neural Processing Letters
Abstract: High-dimensional small-sample data commonly contain many irrelevant and redundant features. Feature selection addresses this problem by removing such features, which improves the accuracy of the learning algorithm. In some information-theoretic feature selection algorithms, however, choosing different parameter values amounts to choosing a different feature selection algorithm, so dynamically avoiding parameters that must be fixed a priori has become an urgent problem. This paper proposes a dynamically weighted conditional relevance dispersion and redundancy analysis (WRRFS) algorithm for feature selection. First, the algorithm uses mutual information to calculate feature correlations and the redundancy between features. Second, it calculates the mean of the feature correlation terms and dynamically adjusts the weights of the conditional feature correlation terms using the standard deviation. Finally, WRRFS is compared with other feature selection algorithms on three classifiers and 12 datasets, using the classification metrics f1_macro, f1_micro, and f1_weighted. The experimental results show that WRRFS can improve the quality of the selected feature subsets and increase classification accuracy.
computer science, artificial intelligence
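The abstract describes the selection procedure only in outline, so the following is a minimal Python sketch of a greedy, mutual-information-based selector of that general kind, not the WRRFS criterion itself. The function names, the scoring formula, and in particular the weight `1 / (1 + std)` are assumptions made for illustration; the abstract does not state how the standard deviation enters the weight. Features are assumed to be discrete.

```python
from collections import Counter
from math import log
from statistics import mean, pstdev


def mutual_information(x, y):
    """I(X; Y) in nats for two equal-length sequences of discrete values."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


def conditional_mutual_information(x, y, z):
    """I(X; Y | Z), computed as the p(z)-weighted sum of I(X; Y | Z = z)."""
    n = len(z)
    total = 0.0
    for value, count in Counter(z).items():
        rows = [i for i in range(n) if z[i] == value]
        total += (count / n) * mutual_information(
            [x[i] for i in rows], [y[i] for i in rows]
        )
    return total


def greedy_select(features, labels, k):
    """Greedily pick k feature names from a dict of name -> discrete column.

    Hypothetical score (an assumption, not the paper's formula):
        relevance + weight * mean(conditional relevance) - mean(redundancy)
    where weight = 1 / (1 + std of the conditional relevance terms).
    """
    remaining = dict(features)
    selected = []
    while remaining and len(selected) < k:
        best_name, best_score = None, float("-inf")
        for name, column in remaining.items():
            score = mutual_information(column, labels)  # relevance to the class
            if selected:
                conditional = [
                    conditional_mutual_information(column, labels, features[s])
                    for s in selected
                ]
                redundancy = [
                    mutual_information(column, features[s]) for s in selected
                ]
                # Placeholder dynamic weight driven by the standard deviation.
                weight = 1.0 / (1.0 + pstdev(conditional))
                score += weight * mean(conditional) - mean(redundancy)
            if score > best_score:  # ties keep the earlier feature
                best_name, best_score = name, score
        selected.append(best_name)
        del remaining[best_name]
    return selected


# Toy data: the class is determined jointly by "a" and "c"; "b" duplicates "a".
a = [0, 0, 1, 1, 0, 0, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
toy_features = {"a": a, "b": list(a), "c": c}
toy_labels = [2 * ai + ci for ai, ci in zip(a, c)]
chosen = greedy_select(toy_features, toy_labels, 2)
```

On this toy data all three features are equally relevant on their own, but once `a` is selected the duplicate `b` adds no conditional information and is fully redundant, so the sketch picks `c` second.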