Systematic Evaluation of Synthetic Data Augmentation for Multi-class NetFlow Traffic

Maximilian Wolf, Dieter Landes, Andreas Hotho, Daniel Schlör
2024-08-28
Abstract: The detection of cyber-attacks in computer networks is a crucial and ongoing research challenge. Machine learning-based attack classification offers a promising solution, as these models can be continuously updated with new data, enhancing the effectiveness of network intrusion detection systems (NIDS). Unlike binary classification models that simply indicate the presence of an attack, multi-class models can identify specific types of attacks, allowing for more targeted and effective incident responses. However, a significant drawback of these classification models is their sensitivity to imbalanced training data. Recent advances suggest that generative models can assist in data augmentation, claiming to offer superior solutions for imbalanced datasets. Classical balancing methods, although less novel, also provide potential remedies for this issue. Despite these claims, a comprehensive comparison of these methods within the NIDS domain is lacking. Most existing studies focus narrowly on individual methods, making it difficult to compare results due to varying experimental setups. To close this gap, we designed a systematic framework to compare classical and generative resampling methods for class balancing across multiple popular classification models in the NIDS domain, evaluated on several NIDS benchmark datasets. Our experiments indicate that resampling methods for balancing training data do not reliably improve classification performance. Although some instances show performance improvements, the majority of results indicate decreased performance, with no consistent trend in favor of a specific resampling technique enhancing a particular classifier.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
This paper addresses the class imbalance problem in multi-class attack classification for network intrusion detection systems (NIDS). Specifically, the authors focus on:

1. **The impact of imbalanced data on multi-class attack classification**: Some attack types are represented by only a few samples, so the training data is class-imbalanced. This imbalance degrades classifier performance, particularly when identifying minority-class attacks.
2. **Evaluating the effectiveness of data augmentation methods**: To address class imbalance, researchers have proposed various data augmentation methods, including traditional resampling techniques and modern generative models. However, a systematic comparison of these methods in the NIDS field is lacking. The paper therefore evaluates different data augmentation methods within one systematic framework, with a particular focus on comparing generative models against traditional resampling techniques.
3. **Selecting appropriate data augmentation methods**: Different data augmentation methods may affect different classification models differently. The authors therefore investigate experimentally which combinations improve classification performance, and whether any single method performs better in all cases.

### Research background and motivation

In network intrusion detection, multi-class attack classification identifies specific attack types, which enables more targeted responses. However, because attack events are specific and rare, the datasets used for benchmarking often have an imbalanced class distribution. This imbalance affects classifier performance, especially on the minority classes.
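To make the effect of class imbalance concrete, the following minimal sketch compares plain accuracy with macro-averaged recall for a trivial predictor that always outputs the majority class. The label names and class sizes are hypothetical and chosen only for illustration; they are not taken from the paper's datasets, and the metric helpers are hand-written stand-ins rather than the paper's evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


# Hypothetical, heavily imbalanced label distribution (illustration only).
y_true = ["benign"] * 950 + ["dos"] * 40 + ["portscan"] * 10

# A degenerate classifier that always predicts the most frequent class.
majority_label = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_label] * len(y_true)

acc = accuracy(y_true, y_pred)
mrec = macro_recall(y_true, y_pred)
print(f"accuracy={acc:.3f}  macro recall={mrec:.3f}")
```

The predictor never detects a single attack, yet its accuracy stays high because the majority class dominates the count. Macro-averaged metrics expose the failure because each class contributes equally regardless of its size.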
To solve this problem, researchers have proposed various data augmentation methods, ranging from traditional techniques such as random oversampling and SMOTE to modern generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs).

### Main contributions of the paper

1. **Comprehensive evaluation framework**: The authors built a systematic test platform that combines multiple class-balancing strategies (both classical resampling and modern generative models) and applies them to multiple multi-class classification models.
2. **Comparative analysis**: 42 resampling combinations were evaluated on three datasets, using multiple established classification metrics to ensure a thorough analysis of the effects of the different resampling methods.
3. **Reproducibility**: The code and experimental settings are made public to promote reproducibility and further research.

### Experimental results

The experimental results show that most resampling strategies have a slight negative impact on overall performance, and that this effect is more pronounced for the minority classes. Although certain classifier-dependent combinations show improvement, no method reliably improves performance in all cases. This indicates that arbitrarily choosing a resampling or augmentation technique is not a reliable way to improve multi-class classification performance. Instead, the choice of technique should be treated as a hyperparameter and optimized for each specific classifier.

Based on this analysis, the paper emphasizes the importance of selecting appropriate data augmentation methods in the NIDS field and suggests that future research focus on how to optimize these methods for different classification models.
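The recommendation to treat the resampling method as a hyperparameter can be sketched as a small selection loop. This is a minimal illustration under stated assumptions, not the paper's framework: the synthetic two-feature data, the nearest-centroid classifier, the candidate names, and helpers such as `random_oversample` and `make_split` are all hypothetical. Only the training split is resampled; the validation split keeps its original class distribution.

```python
import random


def identity(X, y, seed=0):
    """Baseline candidate: leave the training data unchanged."""
    return list(X), list(y)


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def fit_centroids(X, y):
    """Toy classifier: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, X):
    """Assign each row to the class with the closest centroid."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: sq_dist(centroids[c], row)) for row in X]


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = []
    for cls in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def make_split(rng, sizes):
    """Draw hypothetical 2-D Gaussian samples per class (illustration only)."""
    centers = {"benign": (0.0, 0.0), "dos": (3.0, 0.0), "portscan": (0.0, 3.0)}
    X, y = [], []
    for label, n in sizes.items():
        cx, cy = centers[label]
        for _ in range(n):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y


rng = random.Random(42)
X_train, y_train = make_split(rng, {"benign": 300, "dos": 30, "portscan": 6})
X_val, y_val = make_split(rng, {"benign": 100, "dos": 10, "portscan": 4})

# The resampling method is one more hyperparameter; "none" is a valid choice.
candidates = {"none": identity, "random_oversampling": random_oversample}
scores = {}
for name, resample in candidates.items():
    X_res, y_res = resample(X_train, y_train)  # resample the training split only
    model = fit_centroids(X_res, y_res)
    scores[name] = macro_recall(y_val, predict(model, X_val))

best = max(scores, key=scores.get)
print(scores, "-> selected:", best)
```

In a real study the candidate set would also contain SMOTE and generative resamplers, and the loop would be repeated for every classifier, since the summary above reports that the best choice depends on the classifier and is often no resampling at all.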