Information Entropy Based Clustering Method for Unsupervised Internet Traffic Classification

Jing Yuan, Zhu Li, Ruixi Yuan
DOI: https://doi.org/10.1109/icc.2008.307
2008-01-01
Abstract: In this paper, we develop an unsupervised machine learning method for network traffic classification. First, the traffic flows from active hosts are separated into clusters by sequentially applying information entropy techniques. The clusters are then classified into broad-based application types by examining the parameters of the clusters, as well as their dynamic properties during the clustering process. Experimental results from a campus backbone environment are presented, in which the network traffic flows from the top 20 active hosts are identified. The results show an identification accuracy of approximately 93.81%. By comparison, prior work using unsupervised machine learning methods generally reports lower classification accuracy, with the highest being 86.5%. We also show that our method can be combined with a supervised learning method to further improve classification accuracy and to train the classification engine automatically.
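The abstract does not spell out the entropy computation, so the following is only a minimal sketch of the underlying measure, not the authors' algorithm. It computes the Shannon entropy of individual flow attributes for one host. The flow tuples, the attribute choice (destination IP, destination port, protocol), and the function name `shannon_entropy` are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    # H = sum over distinct values of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical flows from a single host: (dst_ip, dst_port, protocol).
# These records are made up for illustration, not taken from the paper.
flows = [
    ("10.0.0.1", 80, "tcp"),
    ("10.0.0.2", 80, "tcp"),
    ("10.0.0.3", 80, "tcp"),
    ("10.0.0.4", 80, "tcp"),
]

# Four distinct destinations, each seen once: maximal spread.
dst_ip_entropy = shannon_entropy([f[0] for f in flows])
# A single destination port across all flows: no spread.
dst_port_entropy = shannon_entropy([f[1] for f in flows])

print(dst_ip_entropy, dst_port_entropy)
```

In this toy input, the destination IP attribute is spread evenly while the destination port is constant, so the two entropies sit at opposite extremes. How the paper turns such per-attribute values into cluster splits is not stated in the abstract.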