A High Performance Intrusion Detection System Using LightGBM Based on Oversampling and Undersampling

Hao Zhang, Lina Ge, Zhe Wang
DOI: https://doi.org/10.1007/978-3-031-13870-6_53
2022-01-01
Abstract: Intrusion detection systems play an important role in network security; however, data imbalance limits their detection ability. To improve intrusion detection performance, this paper proposes using the adaptive synthetic sampling technique (ADASYN) together with random undersampling to alleviate data imbalance. First, majority-class samples are reduced by random undersampling and minority-class samples are oversampled, so that the classes reach a balanced state. Subsequently, a sparse autoencoder (SAE) extracts features from the resampled data to fit the original samples as closely as possible. Finally, LightGBM classifies the processed dataset. Multi-class classification experiments were conducted on the KDD99 and UNSW-NB15 datasets. We compare the performance of six models and find that LightGBM is superior to the other models. Furthermore, we compare against existing methods, and the results show that the proposed method outperforms them.
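The resampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only version of random undersampling followed by ADASYN-style oversampling, in which minority points with more majority-class neighbours receive more synthetic samples. The function names, the two-dimensional toy data, and the parameter values (100 retained majority samples, 80 synthetic samples, k = 5) are all assumptions chosen for illustration. The SAE feature extraction and LightGBM classification stages are omitted. In practice a library such as imbalanced-learn (its `ADASYN` and `RandomUnderSampler` classes) would likely be used instead.

```python
import math
import random


def random_undersample(samples, target, rng):
    """Keep a random subset of `target` samples (all of them if fewer exist)."""
    if len(samples) <= target:
        return list(samples)
    return rng.sample(samples, target)


def nearest(point, candidates, k, key=lambda c: c):
    """Return the k candidates closest to `point`, excluding `point` itself."""
    others = [c for c in candidates if key(c) is not point]
    others.sort(key=lambda c: math.dist(point, key(c)))
    return others[:k]


def adasyn_like_oversample(minority, majority, n_new, k, rng):
    """Generate roughly n_new synthetic minority points, ADASYN-style."""
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    # Difficulty of each minority point: the fraction of majority-class
    # points among its k nearest neighbours in the whole dataset.
    ratios = []
    for p in minority:
        neigh = nearest(p, labelled, k, key=lambda q: q[0])
        ratios.append(sum(1 for _, label in neigh if label == 0) / k)
    total = sum(ratios)
    if total == 0:
        # No minority point borders the majority class: spread evenly.
        weights = [1 / len(minority)] * len(minority)
    else:
        weights = [r / total for r in ratios]
    synthetic = []
    for p, w in zip(minority, weights):
        neighbours = nearest(p, minority, k)
        if not neighbours:
            continue  # a lone minority point has nothing to interpolate with
        for _ in range(round(w * n_new)):
            q = rng.choice(neighbours)
            t = rng.random()
            # Interpolate between p and a randomly chosen minority neighbour.
            synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return synthetic


# Toy imbalanced data (hypothetical): 200 majority points, 20 minority points.
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(20)]

kept_majority = random_undersample(majority, 100, rng)
synthetic = adasyn_like_oversample(minority, kept_majority, 80, 5, rng)
balanced_minority = minority + synthetic
```

Because each synthetic point is an interpolation between two existing minority points, the oversampled class stays within the region already covered by the minority data. Per-point rounding means the number of synthetic samples may differ slightly from the requested count.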