A Novel Hybrid Gene Selection Based on Random Forest Approach and Binary Dragonfly Algorithm

Sayed Pedram Haeri Boroujeni, Elnaz Pashaei
DOI: https://doi.org/10.1109/cce53527.2021.9633105
2021-11-10
Abstract: Microarray datasets contain a huge number of genes but only a few samples. This imbalance can lead to the curse of dimensionality in large datasets. To overcome this challenge, gene selection is used to identify the independent genes and remove redundant or noisy ones from the dataset. This study proposes a novel hybrid approach that combines Random Forest Ranking (RFR) and the Binary Dragonfly Algorithm (BDA) to identify the significant genes. The proposed method comprises two steps. In the first step, RFR is employed to remove irrelevant genes and select subsets of optimal genes. In the second step, BDA is applied to select the most informative genes, those that can lead to accurate detection of cancer. BDA is a recently proposed metaheuristic optimizer; here it uses a Naïve Bayes (NB) classifier as its evaluator. Four microarray datasets are used to evaluate the performance of the proposed hybrid approach. Experimental results show that the proposed method significantly outperforms existing metaheuristic methods in terms of classification accuracy and the number of selected genes.
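The two-step structure described in the abstract (a ranking filter followed by a binary metaheuristic wrapper) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-gene importance scores have already been computed (for example by a random forest) and that the evaluator is supplied as a callable (for example Naïve Bayes accuracy on the chosen genes). The names `top_k_by_score`, `v_transfer`, `binary_swarm_search`, and `select_genes`, as well as the defaults for `k`, `n_agents`, and `n_iters`, are hypothetical. The search step is a simplified dragonfly-style update that keeps only inertia, attraction to the best mask found so far, and a small noise term; the separation, alignment, cohesion, and enemy terms of the full BDA are omitted.

```python
import math
import random
from typing import Callable, List, Sequence


def top_k_by_score(scores: Sequence[float], k: int) -> List[int]:
    """Step 1 (filter): keep the indices of the k highest-scoring genes.

    The scores are assumed to be importance values, e.g. from a random forest.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


def v_transfer(step: float) -> float:
    """V-shaped transfer function: maps a real-valued step to a flip probability."""
    return abs(step / math.sqrt(1.0 + step * step))


def binary_swarm_search(
    n_bits: int,
    fitness: Callable[[List[int]], float],
    n_agents: int = 10,
    n_iters: int = 30,
    seed: int = 0,
) -> List[int]:
    """Step 2 (wrapper): simplified dragonfly-style search over 0/1 masks.

    Assumption: only inertia, attraction to the best mask so far ("food"),
    and a small exploration noise are modeled; this is not the full BDA.
    """
    rng = random.Random(seed)
    masks = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_agents)]
    steps = [[0.0] * n_bits for _ in range(n_agents)]
    best = max(masks, key=fitness)[:]
    best_fit = fitness(best)
    for _ in range(n_iters):
        for a in range(n_agents):
            for j in range(n_bits):
                food = best[j] - masks[a][j]      # pull toward the best mask
                noise = rng.uniform(-0.3, 0.3)    # exploration (assumed value)
                steps[a][j] = 0.5 * steps[a][j] + rng.random() * food + noise
                # Flip the bit with probability given by the transfer function.
                if rng.random() < v_transfer(steps[a][j]):
                    masks[a][j] = 1 - masks[a][j]
            fit = fitness(masks[a])
            if fit > best_fit:
                best, best_fit = masks[a][:], fit
    return best


def select_genes(
    scores: Sequence[float],
    k: int,
    evaluate: Callable[[List[int]], float],
    seed: int = 0,
) -> List[int]:
    """Run both steps and return the indices of the selected genes.

    `evaluate` receives a list of gene indices and returns a score to maximize,
    e.g. the cross-validated accuracy of a Naive Bayes classifier.
    """
    kept = top_k_by_score(scores, k)

    def fitness(mask: List[int]) -> float:
        return evaluate([g for g, bit in zip(kept, mask) if bit])

    mask = binary_swarm_search(len(kept), fitness, seed=seed)
    return [g for g, bit in zip(kept, mask) if bit]
```

In practice `evaluate` would train and score a classifier on the chosen gene columns; a toy stand-in such as `lambda genes: len(set(genes) & {1, 3}) - 0.1 * len(genes)` is enough to exercise the pipeline. Running the filter first keeps the wrapper's search space small, since the number of candidate masks grows exponentially with the number of retained genes.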