Multi-Class Network Anomaly Detection Using Machine Learning Techniques
Satyanarayana Gunupusala, Shahu Chatrapathi Kaila
DOI: https://doi.org/10.37256/cm.5220243723
2024-06-19
Contemporary Mathematics
Abstract: Computer networks rely on Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) to ensure the security, reliability, and availability of an organization. In recent years, various approaches have been developed and implemented to create effective IDSs and IPSs. This paper focuses on IDSs that use Machine Learning (ML) techniques for improved accuracy. ML-based IDSs have proven successful in detecting network attacks. However, their performance tends to decline in high-dimensional data spaces. To address this issue, it is essential to develop a suitable feature extraction strategy that identifies and removes irrelevant features that do not contribute significantly to the classification process. Additionally, many ML-based IDSs exhibit high false positive rates and poor detection accuracy when trained on unbalanced datasets. In this study, we analyze the UNSW-NB15 IDS dataset, which serves as the training and testing data for our models. To reduce the feature space and improve the efficiency of our analysis, we apply a filter-based feature reduction method based on the Pearson correlation coefficient. By selecting only the most relevant features, we streamline the dataset and focus on the variables that have the greatest impact on the analysis. This approach reduces computational complexity and also improves the interpretability of the results by eliminating noise from the data. After applying the feature reduction technique, we implement a range of machine learning methods to perform the classification task. These include well-known algorithms such as Stacking, Extra Trees, Multi-Layer Perceptron, XGBoost, K-Nearest Neighbors, Logistic Regression, Naïve Bayes, Support Vector Machine, Random Forest, and Decision Tree.
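The filter-based reduction step described above can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so the version below assumes one common variant: a feature is kept when the absolute Pearson correlation between that feature and a numerically encoded label reaches a threshold. The function names, the threshold of 0.1, and the toy columns are all hypothetical and are not taken from the paper or from UNSW-NB15.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.1):
    """Keep features whose |r| with the target is at least `threshold`.

    Assumption: relevance is scored against a numerically encoded label.
    The paper's actual rule and threshold are not given in the abstract.
    """
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Hypothetical toy data: six records, three candidate features.
columns = {
    "f_signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],   # rises with the label
    "f_noise": [1.0, -1.0, 0.0, 0.0, -1.0, 1.0],  # unrelated to the label
    "f_const": [7.0, 7.0, 7.0, 7.0, 7.0, 7.0],    # constant column
}
target = [0, 0, 0, 1, 1, 1]

kept, scores = select_features(columns, target)
print(kept)
print({name: round(r, 4) for name, r in scores.items()})
```

Another widely used filter variant instead drops one feature from each pair of strongly inter-correlated features; which variant applies here would need to be confirmed against the full paper.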
By employing a diverse set of algorithms, we explore different modeling approaches and evaluate how accurately each classifies the various attack types. To assess the performance of the classification models, we use a range of evaluation metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), R2-Score, Mean Squared Error (MSE), Precision, F1-Score, Recall, and Accuracy. These metrics describe model performance along several dimensions, including the accuracy of predictions, the precision with which different attack types are classified, and the overall goodness-of-fit of the models. Considering multiple metrics gives a more nuanced view of the strengths and weaknesses of each algorithm and supports informed decisions about its suitability for the classification task. Together, the metrics provide a thorough evaluation of the classifiers' effectiveness in detecting network intrusions.
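To make the listed metrics concrete, the following sketch computes them for a small multi-class example. It assumes macro averaging for Precision, Recall, and F1-Score, and assumes that MAE, MSE, RMSE, and R2-Score are computed on integer-encoded class labels; the abstract specifies neither choice. The function names and the toy labels are hypothetical.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    k = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


def error_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, and R2 on integer-encoded labels (assumed encoding)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1 - sum(e * e for e in errors) / ss_tot if ss_tot else 0.0,
    }


# Hypothetical toy predictions over three encoded classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cls = classification_metrics(y_true, y_pred)
err = error_metrics(y_true, y_pred)
print({k: round(v, 4) for k, v in cls.items()})
print({k: round(v, 4) for k, v in err.items()})
```

Because the error-based metrics operate on the numeric codes, their values change if the classes are assigned different integers, whereas Accuracy, Precision, Recall, and F1-Score do not.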