Research of Clustering Algorithm Based on Information Entropy and Frequency Sensitive Discrepancy Metric in Anomaly Detection

Han Li, Qiuxin Wu
DOI: https://doi.org/10.1109/iscc-c.2013.108
2014-01-01
Abstract: Anomaly detection is an active branch of intrusion detection technology that can detect intrusion behaviors, including abnormal system or user behavior and unauthorized use of computer resources. Clustering analysis is an unsupervised method for grouping a data set into multiple clusters. Using a clustering algorithm to detect anomalous behavior offers good scalability and adaptability. This paper focuses on improving the k-means clustering algorithm and using it to detect abnormal records. Our goal is to increase the detection rate (DR) and decrease the false alarm rate (FAR) in anomaly detection by calculating appropriate parameter values and improving the clustering algorithm. In our IE&FSDM algorithm, we use the minimum standard information entropy of network records to compute the initial cluster centers. In the testing phase, a discrepancy metric is introduced to help calculate the exact number of clusters in the testing data set. Using the initial cluster centers calculated in the previous phase, IE&FSDM computes the actual clusters by converging the cluster centers and obtains the actual cluster centers according to the frequency sensitive discrepancy metric. Then, following the improved k-means algorithm, it iterates until all network data are assigned to their corresponding clusters, and the clustering results are used to classify network behaviors as normal or abnormal. Finally, we implement the IE&FSDM algorithm on the KDD CUP 1999 dataset. Test results show that, compared with previous clustering methods, the IE&FSDM algorithm improves the detection rate of anomalous behavior and reduces the false alarm rate.
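The two evaluation quantities named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (DR as the fraction of abnormal records flagged as abnormal, FAR as the fraction of normal records flagged as abnormal); the abstract does not state its formulas, and the function name `detection_metrics` and the boolean label encoding are hypothetical choices made here for illustration.

```python
def detection_metrics(y_true, y_pred):
    """Return (DR, FAR) for paired boolean labels, where True means abnormal.

    Assumed conventional definitions (not taken from the paper):
      DR  = abnormal records flagged abnormal / all abnormal records
      FAR = normal records flagged abnormal   / all normal records
    """
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual and predicted:
            tp += 1      # attack correctly detected
        elif actual and not predicted:
            fn += 1      # attack missed
        elif not actual and predicted:
            fp += 1      # normal record raised a false alarm
        else:
            tn += 1      # normal record correctly passed
    # Guard against empty classes so the ratios are always defined.
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    truth = [True, True, False, False, False]
    flagged = [True, False, True, False, False]
    print(detection_metrics(truth, flagged))
```

In a clustering setting, `y_pred` would come from whatever rule maps clusters to "normal" or "abnormal"; that rule is not specified in the abstract and is left out of the sketch.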