Improved versions of snake optimizer for feature selection in medical diagnosis: a real case COVID-19
Malik Sh. Braik, Abdelaziz I. Hammouri, Mohammed A. Awadallah, Mohammed Azmi Al-Betar, Omar A. Alzubi
DOI: https://doi.org/10.1007/s00500-023-09062-3
IF: 3.732
2023-08-16
Soft Computing
Abstract: Classification of medical data depends largely on effectively identifying the key features of the data that can aid in the diagnosis of related diseases. This goal can be achieved through feature selection methods, which remove redundant and irrelevant features to improve classification accuracy. This is the aim of this work, in which a new meta-heuristic, referred to as the snake optimizer, is adopted to boost the performance of existing feature selection methods. This optimizer can easily become trapped in local optima, which may lead to weak search performance and slow convergence when handling feature selection problems. On this basis, this paper presents three improved adaptive versions of the optimizer, each with better search performance than the basic optimizer. The optimizer was improved using three mathematical models, named exponential, power, and delayed S-shaped, to create three methods, referred to as the exponential, power, and delayed S-shaped snake optimizers, respectively. These proposed versions were also refined to achieve a better balance between exploration and exploitation. Binary variants of these optimizers were then developed to solve feature selection problems using the k-nearest neighbor classifier. To verify the efficacy of these binary optimizers, they were evaluated on 24 datasets and compared with other feature selection optimizers. The experimental results clearly demonstrated the efficiency of the proposed optimizers in finding the optimal feature set, achieving the highest accuracy and the smallest number of features on the majority of the studied datasets. The proposed binary power snake optimizer outperformed all other competitors on 13, 10, 8, 8, and 12 datasets in terms of classification accuracy, number of selected features, specificity, sensitivity, and fitness scores, respectively.
Of the 24 datasets considered, this proposed optimizer achieved scores above 90% on 12, 6, and 8 datasets in terms of sensitivity, accuracy, and specificity, respectively.
computer science, artificial intelligence, interdisciplinary applications
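Illustrative sketch (not from the paper): the abstract describes binary optimizers that select features and score them with a k-nearest neighbor classifier, balancing accuracy against the number of selected features. The Python below is a minimal sketch of how such a wrapper fitness is commonly set up. The weighted-sum form, the weight alpha = 0.99, the leave-one-out evaluation, the toy data, and all function names are assumptions made for illustration; the abstract does not state the fitness definition or evaluation protocol actually used.

```python
import math


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = {}
    for _, label in neighbors[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def loo_accuracy(X, y, mask, k=3):
    """Leave-one-out k-NN accuracy using only the features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # assumption: an empty feature subset scores zero accuracy
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, Xs[i], k) == y[i]:
            correct += 1
    return correct / len(Xs)


def fitness(X, y, mask, alpha=0.99, k=3):
    """Assumed wrapper fitness (lower is better):
    alpha * error rate + (1 - alpha) * fraction of features kept."""
    error = 1.0 - loo_accuracy(X, y, mask, k)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * ratio


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X_toy = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
         [1.0, 5.2], [1.1, 0.8], [1.2, 9.1]]
y_toy = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    for mask in ([1, 0], [0, 1], [1, 1]):
        print(mask, fitness(X_toy, y_toy, mask, k=1))
```

In this sketch a binary optimizer would propose candidate masks and keep the one with the lowest fitness. On the toy data, the mask that keeps only the informative feature scores better than the mask that keeps both features.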