Grid-based Data Stream Clustering for Intrusion Detection

Q. Qian, Chao-Jie Xiao, Rui Zhang
2013-01-01
Abstract: As a stream data mining method, stream clustering has great potential in areas such as network traffic analysis and intrusion detection. This paper proposes a novel grid-based clustering algorithm for stream data that combines the advantages of grid mapping and the DBSCAN algorithm. The algorithm adopts a two-phase model. In the online phase, it maps stream data into a grid, and the geometric center of all the data in each grid cell is used as an approximate representation of the data in that cell. In the offline phase, a grid-based DBSCAN clustering algorithm clusters all grids in the space based on density. An extension of the algorithm to an incremental version is also presented in detail. The proposed algorithm addresses the difficulty of finding neighbor grids in the DStream algorithm and also overcomes DBSCAN's lack of data compression, which makes DBSCAN applicable to stream data. Experimental results on the KDDCUP99 intrusion detection dataset show that the algorithm achieves good clustering quality and efficiency: the average accuracy is above 92%, the highest order of magnitude of SSQ is 10^4, and the average processing time for 10,000 sessions is about 3 seconds.
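The two-phase model described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `GridSummary`, `map_to_grid`, and `cluster_grids`, and the parameters `cell_width` and `min_count`, are hypothetical. The sketch also assumes two simplifications that the abstract does not specify: a cell counts as dense when it holds at least `min_count` points, and two dense cells are neighbors when their grid indices differ by at most one in every dimension.

```python
from collections import deque
from itertools import product


class GridSummary:
    """Running count and coordinate sums for one grid cell (hypothetical helper)."""

    def __init__(self, dim):
        self.count = 0
        self.sums = [0.0] * dim

    def add(self, point):
        self.count += 1
        for i, x in enumerate(point):
            self.sums[i] += x

    def center(self):
        # Geometric center of all points mapped to this cell so far.
        return tuple(s / self.count for s in self.sums)


def map_to_grid(stream, cell_width):
    """Online phase: map each point to a cell and update that cell's summary."""
    grids = {}
    for point in stream:
        key = tuple(int(x // cell_width) for x in point)
        if key not in grids:
            grids[key] = GridSummary(len(point))
        grids[key].add(point)
    return grids


def cluster_grids(grids, min_count):
    """Offline phase: DBSCAN-style expansion over adjacent dense cells.

    Assumption: density is a simple point-count threshold and the
    neighborhood is the set of index-adjacent cells.
    """
    dense = {k for k, g in grids.items() if g.count >= min_count}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbor = tuple(c + o for c, o in zip(cell, offset))
                if neighbor in dense and neighbor not in labels:
                    labels[neighbor] = next_label
                    queue.append(neighbor)
        next_label += 1
    return labels
```

Because each cell stores only a count and per-dimension sums, the raw points can be discarded once they are mapped, which is the kind of compression that lets a density-based method run over a stream. Sparse cells are left unlabeled here, which corresponds to treating them as noise.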
Computer Science