Improved discrete salp swarm algorithm using exploration and exploitation techniques for feature selection in intrusion detection systems
Malek Barhoush, Bilal H. Abed-alguni, Nour Elhuda A. Al-qudah
DOI: https://doi.org/10.1007/s11227-023-05444-4
IF: 3.3
2023-06-19
The Journal of Supercomputing
Abstract: The salp swarm algorithm (SSA) is a well-known optimization algorithm that is increasingly used to solve many kinds of optimization problems. However, SSA may converge to sub-optimal solutions when it is applied to discrete problems such as the feature selection (FS) problem. This paper presents the enhanced opposition-based learning salp swarm algorithm (EOSSA), an improved SSA for solving the FS problem in intrusion detection systems (IDS). EOSSA incorporates four improvements into the original SSA. First, the opposition-based learning (OBL) method is used in the initialization step of SSA to boost its population diversity. Second, elite opposition-based learning (EOBL) is used in the improvement loop of SSA to improve its exploration ability. Third, a variable neighborhood search (VNS) method is used in the improvement loop of SSA to strengthen its search mechanism within the local search space. Fourth, the Sigmoid binary transform function is used to convert the continuous candidate solutions produced by SSA into discrete binary solutions. EOSSA was evaluated against eighteen popular optimization algorithms (e.g., improved salp swarm algorithm based on opposition-based learning (ISSA), SSA, particle swarm optimization (PSO), cuckoo search (CS), bat algorithm (BA), and Harris Hawk Optimization (HHO)) using eleven popular intrusion detection datasets (CICIDS2017, CSE-CIC-IDS2018, CICDDOS2019, CIRA-CIC-DoH, Intrusion detection 2018, UNSW-NB15, NSL-KDD, Phishing Legitimate, Malmem2022, IoT, and LUFlow Network) to detect IoT botnet attacks. Moreover, EOSSA was compared with four machine learning algorithms (Decision Tree (DT), logistic regression (LR), Naive Bayes (NB), and K-Nearest Neighbors (KNN)). The overall simulation results suggest that the proposed method is superior to the other algorithms in terms of accuracy and number of selected features. The statistical analysis of the simulation results using the Friedman and Wilcoxon signed-rank tests confirms the superiority of the proposed method.
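Two of the components named in the abstract, OBL initialization and the Sigmoid binary transform, can be illustrated with a short sketch. The abstract does not give the paper's exact formulas, so this uses the common textbook forms: the opposite of a point x in [lower, upper] is lower + upper - x, and a feature is selected with probability sigmoid(x). The function names, the bounds of [-1, 1], and `toy_fitness` are hypothetical choices for illustration, not taken from the paper. EOBL and VNS are not shown.

```python
import numpy as np


def opposite(pop, lower, upper):
    """Textbook opposition-based learning: mirror each position in [lower, upper]."""
    return lower + upper - pop


def sigmoid_binarize(pop, rng):
    """S-shaped transfer: feature j is selected with probability sigmoid(x_j)."""
    prob = 1.0 / (1.0 + np.exp(-pop))
    return (rng.random(pop.shape) < prob).astype(int)


def obl_init(n_salps, n_features, fitness, lower=-1.0, upper=1.0, seed=0):
    """Draw a random population, add its opposites, keep the best n_salps.

    `fitness` scores a binary feature mask; lower is better (assumption).
    """
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lower, upper, size=(n_salps, n_features))
    candidates = np.vstack([pop, opposite(pop, lower, upper)])
    masks = sigmoid_binarize(candidates, rng)
    scores = np.array([fitness(m) for m in masks])
    keep = np.argsort(scores)[:n_salps]
    return candidates[keep], masks[keep]


def toy_fitness(mask):
    """Hypothetical stand-in for a classifier-based objective: fewer features is better."""
    return int(mask.sum())


if __name__ == "__main__":
    positions, masks = obl_init(n_salps=5, n_features=8, fitness=toy_fitness)
    print(masks)
```

In a real feature selection setting, `toy_fitness` would be replaced by an objective that trains a classifier on the selected columns and combines its error rate with the fraction of features kept.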
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture