ADASYN-Random Forest Based Intrusion Detection Model

Zhewei Chen, Wenwen Yu, Linyue Zhou
DOI: https://doi.org/10.1145/3483207.3483232
2022-04-14
Abstract: Intrusion detection has long been a key topic in cyber security, and today's common network threats are both diverse and constantly changing. Intrusion detection datasets are severely imbalanced, which leads to poor classification performance on attack classes with few samples and makes it difficult to detect network attacks accurately and efficiently. To address this, the paper proposes using the Adaptive Synthetic Sampling (ADASYN) method to balance the dataset and the Random Forest algorithm to train the intrusion detection classifier. Comparative experiments on the CICIDS 2017 dataset show that ADASYN combined with Random Forest performs better than the compared approaches. Based on the experimental results, the improvement in precision, recall, F1 score and AUC after applying ADASYN is then analyzed. The experiments show that the proposed method can be applied to intrusion detection on large datasets and can effectively improve the classification accuracy for network attack behaviors. Compared with traditional machine learning models, it offers better performance, generalization ability and robustness.
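To make the balancing step concrete, the following is a minimal sketch of the ADASYN idea for a two-class problem, using only the Python standard library. It is an illustration, not the paper's implementation: the function names `k_nearest` and `adasyn`, the parameters `k`, `beta` and `seed`, and the toy data are all assumptions introduced here, and the paper itself deals with the multi-class CICIDS 2017 dataset. In practice a library such as imbalanced-learn, which provides an ADASYN sampler, would normally be used instead.

```python
import math
import random


def k_nearest(self_idx, pool, k):
    """Indices of the k points in pool closest to pool[self_idx] (itself excluded)."""
    others = [i for i in range(len(pool)) if i != self_idx]
    others.sort(key=lambda i: math.dist(pool[self_idx], pool[i]))
    return others[:k]


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples (hypothetical helper, binary case only)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min, n_maj = len(min_idx), len(y) - len(min_idx)
    total = int((n_maj - n_min) * beta)  # beta=1.0 aims for a fully balanced set
    if total <= 0 or n_min < 2:
        return []

    # r_i: fraction of majority points among the k nearest neighbours of each
    # minority point, searched in the full dataset. Higher means harder to learn.
    ratios = []
    for i in min_idx:
        nbrs = k_nearest(i, X, k)
        ratios.append(sum(y[j] != minority for j in nbrs) / len(nbrs))
    norm = sum(ratios)
    if norm == 0:  # no minority point borders the majority: spread evenly
        ratios, norm = [1.0] * n_min, float(n_min)

    minority_pts = [X[i] for i in min_idx]
    synthetic = []
    for pos, r in enumerate(ratios):
        g_i = round(r / norm * total)  # per-point quota; rounding may shift the sum slightly
        nbrs = k_nearest(pos, minority_pts, k)  # neighbours within the minority class
        for _ in range(g_i):
            x, z = minority_pts[pos], minority_pts[rng.choice(nbrs)]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy data (assumed): 20 majority points around (0, 0), 4 minority points around (2, 2).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)]
X += [[data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)] for _ in range(4)]
y = [0] * 20 + [1] * 4

synthetic = adasyn(X, y, minority=1, k=3)
print(len(synthetic), "synthetic minority samples generated")
```

The synthetic points would be appended to the training set with the minority label before a classifier such as Random Forest is trained. The adaptive part is the per-point quota: minority samples surrounded by more majority neighbours receive more synthetic neighbours, which is what distinguishes ADASYN from uniform oversampling.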
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
This paper addresses dataset imbalance in network intrusion detection, which leads to poor classification performance on attack classes with few samples and makes it difficult to detect network attacks accurately and efficiently. The authors propose using the Adaptive Synthetic Sampling (ADASYN) method to balance the dataset and then training an intrusion detection classifier with the Random Forest algorithm, with the goal of improving the classification accuracy for network attack behaviors. Comparative experiments on the CICIDS 2017 dataset indicate that combining ADASYN with Random Forest outperforms traditional machine learning models in precision, recall, F1 score and AUC, and shows better generalization ability and robustness.
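The four metrics named above can be computed from a classifier's predictions as in the following sketch for the binary case. The function names and the example labels and scores are made up for illustration and are not results from the paper. AUC is computed here from its pairwise-ranking interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one class treated as the positive (attack) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly; O(n^2)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and scores only (1 = attack, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)
```

On imbalanced data these per-class metrics are more informative than overall accuracy, because a classifier that ignores a rare attack class can still score a high accuracy while its recall on that class is near zero.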