A hybrid machine learning approach for feature selection in designing intrusion detection systems (IDS) model for distributed computing networks
Yashar Pourardebil Khah, Mirsaeid Hosseini Shirvani, Homayun Motameni
DOI: https://doi.org/10.1007/s11227-024-06677-7
IF: 3.3
2024-12-08
The Journal of Supercomputing
Abstract: This paper presents a hybrid Machine Learning (ML)-based feature selection algorithm for building an Intrusion Detection System (IDS) model in distributed computing environments, such as those serving Internet of Things (IoT) requests. IoT applications need frequent network connections to reach either cloud or fog resources. Since reliability and trust are prominent attributes of distributed systems, an ML-based algorithm is vital for detecting malicious/attack behaviors as early as possible and with high accuracy. To address this issue, a hybrid ML-based algorithm with four phases is presented. In the first phase, pre-processing, including cleansing and normalization, is performed on the relevant dataset. The next two phases, a heuristic-based and a meta-heuristic-based approach, make up the training step. In the second phase, three effective filter-based heuristic feature rankers sort the features according to their importance; the fuzzy TOPSIS approach is then applied to reach a consensus among them, returning the top-K features. In the third phase, a novel Discrete Gray Wolf Optimization Algorithm (DGWA) is designed as a meta-heuristic approach to optimize the consensus feature subsets; it balances exploitation and exploration of the search space. In the fourth phase, the effectiveness of the proposed hybrid model is tested on the well-known NSL-KDD and UNSW-NB15 datasets against several state-of-the-art methods. The simulation results from different scenarios show that the proposed hybrid ML-based model yields average improvements of 10.60%, 15.85%, 3.30%, 4.39%, and 2.03% on the testing datasets in terms of accuracy, precision, recall, F-cost, and specificity, respectively, over the comparative state-of-the-art methods under the same conditions.
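The evaluation metrics named in the abstract can all be derived from the counts of a binary confusion matrix (attack vs. normal traffic). The following is a minimal sketch and not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical, and it assumes that the abstract's "F-cost" refers to the standard F1 measure (the harmonic mean of precision and recall).

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute IDS evaluation metrics from confusion-matrix counts.

    tp: attacks correctly flagged       fp: normal traffic flagged as attack
    tn: normal traffic correctly passed fn: attacks that were missed
    """
    total = tp + fp + tn + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Assumption: "F-cost" in the abstract is read here as the F1 measure.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Reporting specificity alongside recall matters for intrusion detection because a model can reach high recall simply by flagging most traffic as malicious; specificity exposes the resulting false-alarm rate.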
computer science, theory & methods, engineering, electrical & electronic, hardware & architecture