Sequential Message Characterization for Early Classification of Encrypted Internet Traffic
Wenxiong Chen, Feng Lyu, Fan Wu, Peng Yang, Guangtao Xue, Minglu Li
DOI: https://doi.org/10.1109/tvt.2021.3063738
IF: 6.8
2021-01-01
IEEE Transactions on Vehicular Technology
Abstract: Classifying Internet traffic is critical to many network management tasks, including malicious attack detection, usage monitoring, and load balancing. Because traffic packets are now often transmitted with encryption, at randomized port numbers, and under highly dynamic network conditions, traditional approaches such as port mapping, deep packet inspection, and statistical analysis are no longer effective. In this paper, we first collect extensive traffic flows at the exit router of a university and label each flow with its source application. After extracting the message sequence of every collected flow, where a message consists of multiple consecutive TCP packets, we find that each application type has distinct sequential message features. Leveraging these sequential message features, we develop a system named SMC (Sequential Message Characterization), which performs online traffic classification using the sequential size information of only a few message segments. In SMC, after confirming the long-term dependency among message segments, we build a Long Short-Term Memory (LSTM) neural network to learn from the message size sequence, and then build a multi-classifier that classifies traffic types based on the probability profiles output by the deep LSTM models. Extensive experiments demonstrate that SMC achieves 97% classification accuracy on average. Moreover, with as few as 6 message sizes as input, SMC enables early online traffic classification, especially for heavy-traffic flows, whose median length exceeds 35 message segments.
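The idea of a message size sequence can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that a message consists of multiple consecutive TCP packets, so the boundary rule used here (a change of packet direction ends a message), the `(direction, payload_bytes)` packet representation, the zero-padding, and the function names `message_size_sequence` and `early_input` are all assumptions made for illustration. The default `k = 6` follows the abstract's "as few as 6" message sizes.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical packet representation (an assumption, not from the paper):
# (direction, payload_bytes), with direction +1 for client-to-server
# and -1 for server-to-client.
Packet = Tuple[int, int]


def message_size_sequence(packets: List[Packet]) -> List[int]:
    """Merge runs of consecutive same-direction packets into messages.

    Assumption: a change of direction marks a message boundary.
    Returns the total payload size of each message, in order.
    """
    # Skip packets without payload (for example pure ACKs).
    data = [(d, n) for d, n in packets if n > 0]
    return [
        sum(n for _, n in run)
        for _, run in groupby(data, key=lambda p: p[0])
    ]


def early_input(sizes: List[int], k: int = 6) -> List[int]:
    """Keep the first k message sizes, zero-padded if the flow is shorter."""
    head = sizes[:k]
    return head + [0] * (k - len(head))


# A made-up flow, used only to show the shape of the data.
flow = [(1, 517), (-1, 1460), (-1, 1460), (-1, 320),
        (1, 0), (1, 126), (-1, 51), (1, 400), (1, 1200)]
sizes = message_size_sequence(flow)
features = early_input(sizes)
```

A fixed-length vector such as `features` is the kind of input a sequence model like the LSTM described in the abstract could consume; the model and the multi-classifier themselves are outside the scope of this sketch.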