Performance Analysis of Anomaly-Based Network Intrusion Detection Using Feature Selection and Machine Learning Techniques

Sumedha Seniaray, Rajni Jindal
DOI: https://doi.org/10.1007/s11277-024-11602-5
IF: 2.017
2024-10-05
Wireless Personal Communications
Abstract: Data and information, being a critical part of the Internet, are vital to network security. An Intrusion Detection System (IDS) is required to protect confidentiality, data integrity, and system availability against attacks. An IDS collects network data from various sources, and this data may contain redundant and irrelevant features that increase processing time and lower the detection rate. This study proposes a three-phase network-based IDS to counter this issue. In the first phase, network data is captured and preprocessed. In the second phase, feature extraction, selection, and ranking are performed to obtain the optimal feature set. This work also proposes a novel Dynamic Mutual Information-based Genetic Algorithm for feature selection (DMI-GA), which aims to enhance the performance of machine learning (ML) techniques by identifying an optimal set of features. In the final phase, well-known ML models are employed to detect intrusions within this refined set of network traffic features. Experimental results demonstrate a significant improvement in detection accuracy when the ML models are trained and tested on an optimal set of features. DMI-GA combined with the Random Forest classifier achieves the highest detection accuracy of 99.94%, surpassing the performance of existing state-of-the-art anomaly-based network intrusion detection systems. A comprehensive statistical analysis of these ML methods is also conducted using 10-fold and Leave-One-Out cross-validation strategies, which mitigate overfitting and offer a thorough evaluation of model performance, resulting in an average accuracy of 99.91%.
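As a rough illustration of the mutual-information ingredient mentioned in the abstract, the sketch below ranks discrete features by their empirical mutual information with the class label. It is not the paper's DMI-GA: the genetic search and the "dynamic" part are omitted, and the function names (`mutual_information`, `rank_features`), the toy data, and the assumption that features are already discretized are all hypothetical choices made for this example.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)             # marginal counts of X
    py = Counter(ys)             # marginal counts of Y
    pxy = Counter(zip(xs, ys))   # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing MI with the label, MI scores)."""
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 copies the label, features 1 and 2 are noise.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
rows = [(y, rng.randint(0, 1), rng.randint(0, 3)) for y in labels]

order, scores = rank_features(rows, labels)
print("ranking:", order)
print("scores (bits):", [round(s, 4) for s in scores])
```

In this toy setup the informative feature should rank first, while the noise features score close to zero. A selection method in the spirit of the abstract would then search over feature subsets (for example with a genetic algorithm) and evaluate each candidate subset with a classifier under cross-validation.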
telecommunications