A FLOW NATURE CLASSIFICATION METHOD BASED ON MULTI-FEATURES OF N-GRAM
Jie Ding, Liang Huang, Yupeng Tuo, Yafei Sang, Yongzheng Zhang
DOI: https://doi.org/10.3969/j.issn.1000-386x.2017.02.026
2017-01-01
Abstract: Accurate and efficient network traffic classification plays a significant role in improving network quality of service, optimizing bandwidth allocation, strengthening network security management, and supporting research in network-related fields. Most current work on traffic classification focuses on identifying the type of network application or protocol. However, existing methods cannot analyze or classify unknown traffic (generated by unknown applications or protocols) or encrypted traffic. We therefore propose a flow nature classification method based on multiple n-gram features, which classifies network traffic according to the content type of the packet payload (text, audio, video, picture, executable, compressed, or encrypted). First, a set of frequent subsequences is selected using a threshold-based strategy; next, multiple features are extracted to characterize the frequency distribution of the selected set; finally, a C4.5 decision tree classifier is used to classify the content type of the traffic payload. Experimental results show that the approach achieves an average precision of 92.7% and an average recall of 91.9% using only 1 KB of data per flow. Compared with classification methods based on entropy features, the proposed method improves average precision and average recall by approximately 10.8% and 12.1%, respectively.
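The first two steps of the pipeline described in the abstract (threshold-based selection of frequent subsequences, then features describing their frequency distribution) can be illustrated with a minimal Python sketch. The abstract does not give the n-gram length, the threshold, or the actual feature set, so everything here is an assumption made for illustration: byte 2-grams, a count threshold of 2, and four hypothetical summary features (`num_frequent`, `coverage`, `max_count`, `entropy`). Only the 1 KB per-flow limit comes from the abstract. The resulting feature dictionary is the kind of input a decision tree classifier such as C4.5 could be trained on; the classifier itself is not shown.

```python
import math
from collections import Counter


def ngram_counts(payload: bytes, n: int = 2, limit: int = 1024) -> Counter:
    """Count byte n-grams in the first `limit` bytes of a flow payload."""
    data = payload[:limit]  # the abstract reports results with 1 KB per flow
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def frequent_ngram_features(payload: bytes, n: int = 2,
                            threshold: int = 2, limit: int = 1024) -> dict:
    """Summarize the frequency distribution of n-grams that pass a threshold.

    The threshold and the four features below are illustrative assumptions,
    not the feature set used in the paper.
    """
    counts = ngram_counts(payload, n, limit)
    total = sum(counts.values())

    # Threshold-based selection: keep n-grams seen at least `threshold` times.
    frequent = {gram: c for gram, c in counts.items() if c >= threshold}
    freq_total = sum(frequent.values())

    # Shannon entropy (in bits) of the selected n-grams' relative frequencies.
    if freq_total:
        entropy = -sum((c / freq_total) * math.log2(c / freq_total)
                       for c in frequent.values())
    else:
        entropy = 0.0

    return {
        "num_frequent": len(frequent),
        "coverage": freq_total / total if total else 0.0,
        "max_count": max(frequent.values(), default=0),
        "entropy": entropy,
    }


if __name__ == "__main__":
    # Repetitive, text-like bytes versus bytes with no repeated 2-gram.
    print(frequent_ngram_features(b"GET /index.html HTTP/1.1\r\n" * 10))
    print(frequent_ngram_features(bytes(range(256))))
```

In this sketch, repetitive payloads produce many frequent n-grams with high coverage, while payloads in which no n-gram repeats within the first 1 KB produce none, which gives an intuition for why such features might help separate text-like content from compressed or encrypted content.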