Empirical Evaluation of the Performance of Feature Selection Approaches on Random Forest

Smitha S Kumar, Talal Shaikh
DOI: https://doi.org/10.1109/comapp.2017.8079769
2017-09-01
Abstract: Medical data contain valuable information that can save many lives if analyzed and used efficiently. Efficient analysis of this large volume of data demands the right choice of predictors, which in turn can affect the accuracy of the decision support system. Dimensionality reduction and feature subset selection are two techniques for reducing the number of features used in classification. In this paper we perform an empirical evaluation of four feature selection methods applied in conjunction with a Random Forest classifier. The feature selection techniques applied are the Relief feature selection algorithm, the Random Forest selector, Recursive Feature Elimination, and the Boruta feature selection algorithm. Results show that feature selection methods boost the performance of the classifiers, and in this case the features selected by the Boruta feature selection algorithm give the best results.
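To make the first of the four methods concrete, the following is a minimal sketch of the basic two-class Relief weighting scheme in plain Python. It is not the authors' implementation: the abstract does not say which Relief variant or library was used, and the function name `relief_weights`, its parameters, and the toy data are all hypothetical, chosen only for illustration. The idea is that a feature earns weight when it differs between a sample and its nearest neighbour of the other class (the "near miss") and loses weight when it differs between the sample and its nearest neighbour of the same class (the "near hit").

```python
import random


def relief_weights(X, y, n_iterations=None, seed=0):
    """Basic Relief feature weights for two-class data with numeric features.

    Illustrative sketch only; names and defaults are assumptions,
    not taken from the paper.
    """
    rng = random.Random(seed)
    n_samples = len(X)
    n_features = len(X[0])

    # Per-feature ranges, used to scale differences into [0, 1].
    lows = [min(row[j] for row in X) for j in range(n_features)]
    highs = [max(row[j] for row in X) for j in range(n_features)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]

    def diff(j, a, b):
        # Scaled absolute difference of feature j between samples a and b.
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        # Manhattan distance over the scaled features.
        return sum(diff(j, a, b) for j in range(n_features))

    m = n_iterations or n_samples
    weights = [0.0] * n_features
    for _ in range(m):
        i = rng.randrange(n_samples)  # sample an instance (with replacement)
        hits = [k for k in range(n_samples) if k != i and y[k] == y[i]]
        misses = [k for k in range(n_samples) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda k: distance(X[i], X[k]))
        near_miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(n_features):
            # Reward separation from the near miss, penalise it from the near hit.
            weights[j] += (
                diff(j, X[i], X[near_miss]) - diff(j, X[i], X[near_hit])
            ) / m
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
print("weights:", weights)
print("features ranked by relevance:", ranking)
```

On this toy data the informative feature receives the larger weight, so a selector that keeps the top-ranked features would retain it and drop the noise feature before a Random Forest is trained. Practical work more often uses the ReliefF extension, which averages over several neighbours and handles multi-class and noisy data.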