Effective fuzzy joint mutual information feature selection based on uncertainty region for classification problem
Omar A.M. Salem, Feng Liu, Yi-Ping Phoebe Chen, Ahmed Hamed, Xi Chen
DOI: https://doi.org/10.1016/j.knosys.2022.109885
2022-12-05
Abstract: Classification problems are widespread in real-world applications. Data quality is the main challenge for classification models, especially when the data include irrelevant and redundant features. Feature selection (FS) is an effective preprocessing technique for improving data quality. Integrating information theory with fuzzy sets has produced powerful measures, such as fuzzy information measures, on which many feature selection methods are built. However, estimating fuzzy information measures is costly in both space and runtime, and the estimate may also be affected by bias between the certainty and uncertainty regions. This paper proposes a novel instance selection method based on the uncertainty region (ISUR) to overcome these limitations. A state-of-the-art FS method, fuzzy joint mutual information (FJMI), is then adapted to design an effective FS method called fuzzy joint mutual information feature selection based on uncertainty region (FJMIUR). The proposed method consists of two processes: instance selection and feature selection. The former selects the uncertainty region, which improves the estimation of fuzzy information measures and reduces the cost, while the latter selects the most significant features. Comparative experiments on 20 real-world classification datasets, against well-known and state-of-the-art FS methods, were conducted to evaluate the effectiveness of FJMIUR. The results show that FJMIUR outperforms the compared methods in most cases in terms of six classification measures, average percentage of selected features, space, and runtime.
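For orientation, the feature selection stage can be illustrated with the classical (crisp) joint mutual information criterion that FJMI generalizes. The sketch below is not the authors' FJMIUR: it assumes discrete features, uses plain Shannon entropy rather than fuzzy information measures, and omits the ISUR instance selection step entirely. The names `entropy`, `mutual_information`, and `jmi_select` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of discrete columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_information(y, *xs):
    """I(X1, ..., Xn; Y) = H(X1, ..., Xn) + H(Y) - H(X1, ..., Xn, Y)."""
    return entropy(*xs) + entropy(y) - entropy(*xs, y)


def jmi_select(X, y, k):
    """Greedy selection with the classical (non-fuzzy) JMI criterion.

    Assumption: X is an integer-coded array of shape (n_samples, n_features)
    and y is an integer-coded label vector. This is a crisp stand-in for the
    fuzzy measure described in the abstract, not the published method.
    """
    n_features = X.shape[1]
    # Start with the single most relevant feature, argmax_j I(X_j; Y).
    relevance = [mutual_information(y, X[:, j]) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # JMI score: sum over already selected s of I(X_j, X_s; Y).
            score = sum(mutual_information(y, X[:, j], X[:, s]) for s in selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

In this simplified setting, an instance selection stage such as ISUR would correspond to passing only a subset of the rows of `X` and `y` to `jmi_select`; how that subset is chosen is specific to the paper and is not reproduced here.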
computer science, artificial intelligence