Self-Organizing Map assisted Deep Autoencoding Gaussian Mixture Model for Intrusion Detection

Yang Chen, Nami Ashizawa, Seanglidet Yean, Chai Kiat Yeo, Naoto Yanai
DOI: https://doi.org/10.48550/arXiv.2008.12686
2020-08-28
Abstract: In the information age, a secure and stable network environment is essential, and hence intrusion detection is critical for any network. In this paper, we propose a self-organizing map assisted deep autoencoding Gaussian mixture model (SOM-DAGMM) supplemented with well-preserved input space topology for more accurate network intrusion detection. The deep autoencoding Gaussian mixture model comprises a compression network and an estimation network, which can be trained jointly in an unsupervised manner. However, the code generated by the autoencoder is inept at preserving the topology of the input space, which is rooted in the bottleneck of the adopted deep structure. A self-organizing map is introduced to construct SOM-DAGMM and address this issue. The superiority of the proposed SOM-DAGMM is empirically demonstrated with extensive experiments conducted on two datasets. Experimental results show that SOM-DAGMM outperforms state-of-the-art DAGMM on all tests, achieving up to 15.58% improvement in F1 score with better stability.
Machine Learning, Cryptography and Security, Social and Information Networks
What problem does this paper attempt to address?
The paper addresses a weakness of the Deep Autoencoding Gaussian Mixture Model (DAGMM) in network intrusion detection: it does not preserve the topological structure of the input space well during dimensionality reduction, which degrades detection performance. Specifically:

- **Problem Background**: Accurate and stable intrusion detection is crucial for any network environment. Existing deep-learning-based intrusion detection methods such as DAGMM identify anomalies in high-dimensional data through dimensionality reduction followed by density estimation, but they lose the spatial topological information of the input data during the reduction step, which limits their detection accuracy.
- **Limitations of Existing Methods**: The autoencoder in DAGMM has difficulty maintaining the topology of the input space when it generates low-dimensional representations, a consequence of its bottleneck layer. The loss of this topological information makes the model perform poorly in some cases.

To solve this problem, the authors propose a Deep Autoencoding Gaussian Mixture Model combined with a Self-Organizing Map (SOM-DAGMM), which better preserves the topological structure of the input space and thereby improves the accuracy of intrusion detection.

### Specific Improvement Points:

1. **Introduction of SOM**: As an unsupervised learning algorithm, SOM reduces dimensionality while preserving the topological structure of the input data. Combining the low-dimensional representation generated by SOM with the one generated by DAGMM's autoencoder captures the characteristics of the data more comprehensively.
2. **Two-stage Training Strategy**: Because SOM and DAGMM are trained by different mechanisms, the authors adopt a two-stage strategy. First, SOM processes the original data and generates a low-dimensional representation. Then this representation is fed, together with the autoencoder's low-dimensional representation and the reconstruction error, into DAGMM for joint training.
3. **Experimental Verification**: Experiments on two standard datasets (NSL-KDD and CSE-CIC-IDS2018) show that SOM-DAGMM outperforms the original DAGMM on the reported evaluation metrics, with an improvement of up to 15.58% in F1 score and better stability.

In summary, the paper aims to improve DAGMM's performance on intrusion detection by introducing SOM to preserve the topological structure of the input space, yielding more accurate and stable anomaly detection.
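The two-stage strategy described above can be illustrated with a small NumPy sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the SOM representation is taken to be the normalized grid coordinates of each sample's winning node, the reconstruction error is expressed as relative Euclidean distance plus cosine similarity, and a PCA projection (`pca_autoencoder`, a hypothetical helper) stands in for the deep autoencoder. The sketch only assembles the vector that would be passed to the estimation network; the joint training of stage two is not shown.

```python
import numpy as np

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Stage 1: fit a small SOM with the classic online update rule."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.normal(size=(rows * cols, data.shape[1]))
    # Grid position of every node, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3   # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            grid_dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights, coords

def som_winner_coords(data, weights, coords, grid):
    """Map each sample to its winning node's grid coordinates, scaled to [0, 1]."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return coords[d.argmin(axis=1)] / (np.array(grid, dtype=float) - 1.0)

def pca_autoencoder(data, k=2):
    """ASSUMPTION: linear PCA stand-in for the deep autoencoder (code, reconstruction)."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    z_c = (data - mean) @ vt[:k].T
    return z_c, z_c @ vt[:k] + mean

def estimation_input(x, x_hat, z_c, z_som, eps=1e-12):
    """Concatenate autoencoder code, reconstruction features, and SOM coordinates."""
    x_norm = np.linalg.norm(x, axis=1)
    rel_euclid = np.linalg.norm(x - x_hat, axis=1) / (x_norm + eps)
    cosine = (x * x_hat).sum(axis=1) / (x_norm * np.linalg.norm(x_hat, axis=1) + eps)
    return np.column_stack([z_c, rel_euclid, cosine, z_som])

# Synthetic data in place of NSL-KDD / CSE-CIC-IDS2018 features.
x = np.random.default_rng(0).normal(size=(200, 8))
grid = (5, 5)
weights, coords = train_som(x, grid=grid)           # stage 1: SOM alone
z_som = som_winner_coords(x, weights, coords, grid)
z_c, x_hat = pca_autoencoder(x, k=2)                # stand-in compression network
z = estimation_input(x, x_hat, z_c, z_som)          # input to the estimation network
print(z.shape)  # (200, 6)
```

Keeping the SOM fixed after stage one reflects the point made in the summary: SOM uses a competitive update rule rather than gradient descent, so it cannot simply be folded into the joint training of the autoencoder and the estimation network.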