A Graph Representation Learning Algorithm for Low-Order Proximity Feature Extraction to Enhance Unsupervised IDS Preprocessing

Yiran Hao,Yiqiang Sheng,Jinlin Wang
DOI: https://doi.org/10.3390/app9204473
2019-01-01
Applied Sciences
Abstract:Most existing studies on an unsupervised intrusion detection system (IDS) preprocessing ignore the relationship among packets. According to the homophily hypothesis, the local proximity structure in the similarity relational graph has similar embedding after preprocessing. To improve the performance of IDS by building a relationship among packets, we propose a packet2vec learning algorithm that extracts accurate local proximity features based on graph representation by adding penalty to node2vec. In this algorithm, we construct a relational graph G′ by using each packet as a node, calculate the cosine similarity between packets as edges, and then explore the low-order proximity of each packet via the penalty-based random walk in G′. We use the above algorithm as a preprocessing method to enhance the accuracy of unsupervised IDS by retaining the local proximity features of packets maximally. The original features of the packet are combined with the local proximity features as the input of a deep auto-encoder for IDS. Experiments based on ISCX2012 show that the proposal outperforms the state-of-the-art algorithms by 11.6% with respect to the accuracy of unsupervised IDS. It is the first time to introduce graph representation learning for packet-embedded preprocessing in the field of IDS.
What problem does this paper attempt to address?