Robust Feature Selection for IM Applications at Early Stage Traffic Classification Using Machine Learning Algorithms

Muhammad Shafiq,Xiangzhan Yu,Dawei Wang
DOI: https://doi.org/10.1109/hpcc-smartcity-dss.2017.31
2017-01-01
Abstract:Identification of network traffic accurately at its early stage is very important for network traffic management and application traffic classification. In recent years, this becomes very hot topic to identify traffic at its early stage. Unidirectional and bidirectional statistical features are effective features and widely used in Internet traffic classification. However, it is important to evaluate and select effective features for Instant Messaging (IM) application traffic classification at early stage. In this paper we are interested to find out robust and effective features at early stage. We firstly extract 22 statistical features of the first flow on two different network environment traffic datasets include on HIT and NIMS datasets. Then mutual information is conducted between the extract statistical features to select the effective features. Additionally to select robust features, we execute attribute selection cfsSubsetEval with Best search evaluator that select the robust and stable features from the result achieved by mutual information. And then, we execute 10 well-known machine learning classifiers. Our experimental results show that max_fpktl, std_bpktl, max_biat, mean_fpktl, mean_bpktl and min_biat feature are robust features at early stage traffic classification.
What problem does this paper attempt to address?