Interaction-based clustering algorithm for feature selection: a multivariate filter approach

Ahmad Esfandiari, Hamid Khaloozadeh, Faezeh Farivar
DOI: https://doi.org/10.1007/s13042-022-01726-0
2023-04-21
International Journal of Machine Learning and Cybernetics
Abstract: In pattern recognition and data mining, feature selection is a preprocessing step that reduces the dimensionality of data by removing redundant, irrelevant, and noisy features before a machine learning task. Identifying the most informative features within a reasonable computational time is one of the most important challenges for existing feature selection methods. This paper introduces a multivariate filter feature selection method based on feature clustering, called interaction-based feature clustering (IFC), which has a low computational cost while achieving high classification accuracy. In the proposed method, the features are first ranked by the symmetric uncertainty criterion, and then clustered using their interactive weight as a similarity measure. To evaluate performance, the results of the IFC algorithm are compared with six well-known multivariate filter methods on sixteen benchmark datasets using three classifiers: SVM, NB, and kNN. For further evaluation, a comparison is made using the Akaike Information Criterion (AIC) and Pareto front curves. Experimental results show that the IFC algorithm is often more efficient than the compared methods in terms of classification accuracy and computational time, and can be considered a suitable method for the preprocessing step.
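The ranking step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard definition of symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), on already-discretized features. The function names (`entropy`, `symmetric_uncertainty`, `rank_features`) and the dictionary-of-columns input are assumptions made for illustration; the abstract does not give the paper's estimator, its discretization, or the interactive-weight clustering step, so none of those are reproduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in the range [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is no uncertainty to share.
        return 0.0
    # Mutual information from the joint entropy: I(X; Y) = H(X) + H(Y) - H(X, Y).
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def rank_features(columns, labels):
    """Return feature names sorted by decreasing SU with the class labels.

    `columns` is a hypothetical mapping from feature name to a list of
    discrete values, one per sample.
    """
    scores = {name: symmetric_uncertainty(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    columns = {
        "copy_of_label": [0, 0, 1, 1],  # fully informative about the label
        "independent": [0, 1, 0, 1],    # carries no information about the label
    }
    print(rank_features(columns, labels))
```

Continuous features would need to be discretized before being passed to this sketch, since the entropy estimate here counts exact value matches.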
computer science, artificial intelligence