Extending labeled mobile network traffic data by three levels traffic identification fusion

Zhen Liu, Ruoyu Wang, Deyu Tang
DOI: https://doi.org/10.1016/j.future.2018.05.079
IF: 7.307
2018-11-01
Future Generation Computer Systems
Abstract: Mobile traffic classification is critically important for network management decisions such as traffic shaping and traffic pricing. Labeled traffic data are a prerequisite for evaluating classification performance. However, most existing works acquire labeled traffic in a simulated environment, for example by running one specific app at a time on a mobile device and collecting its traffic. This approach is slow and does not scale. This paper devises a scheme to automatically link ground truth to mobile traffic. A set of labeled traffic data is first collected on the monitored mobile devices by our previously presented mobilegt (a system that collects mobile traffic and builds the ground truth). However, this traffic is limited to the monitored nodes. We therefore present a method named ELD (Extending Labeled Data) that identifies the labels of new, unknown mobile traffic and thereby extends the labeled mobile traffic data. ELD divides traffic identification into three levels: packet header, packet payload, and flow statistics. The identification tasks at the three levels are implemented by ServerTag, payload distribution inspection, and Random Forest, respectively. ELD is able to identify mobile traffic with encrypted payloads. Cross-validation results show that ELD achieves 99% flow accuracy and 95.4% byte accuracy on average, at a flow completeness of 86.5% and a byte completeness of 65.5%. The results also show that ELD outperforms the existing tools nDPI and Libprotoident at labeling mobile network traffic.
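To make the four reported metrics concrete, the following is a minimal Python sketch of how flow/byte accuracy and flow/byte completeness could be computed. It assumes the definitions commonly used in traffic-labeling evaluations, which the abstract itself does not spell out: completeness is the share of flows (or bytes) that receive any label, and accuracy is the share of labeled flows (or bytes) whose label matches the ground truth. The `Flow` record and the `evaluate` function are hypothetical names introduced only for illustration; they are not part of ELD or mobilegt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flow:
    """Hypothetical flow record used only for this illustration."""
    true_app: str                 # ground-truth application label
    predicted_app: Optional[str]  # None means the labeler left the flow unlabeled
    n_bytes: int                  # total bytes carried by the flow


def evaluate(flows):
    """Compute flow/byte completeness and accuracy under the assumed definitions."""
    labeled = [f for f in flows if f.predicted_app is not None]
    correct = [f for f in labeled if f.predicted_app == f.true_app]

    total_bytes = sum(f.n_bytes for f in flows)
    labeled_bytes = sum(f.n_bytes for f in labeled)
    correct_bytes = sum(f.n_bytes for f in correct)

    def ratio(num, den):
        # Guard against empty inputs instead of raising ZeroDivisionError.
        return num / den if den else 0.0

    return {
        # Completeness: how much of the traffic received a label at all.
        "flow_completeness": ratio(len(labeled), len(flows)),
        "byte_completeness": ratio(labeled_bytes, total_bytes),
        # Accuracy: how much of the labeled traffic was labeled correctly.
        "flow_accuracy": ratio(len(correct), len(labeled)),
        "byte_accuracy": ratio(correct_bytes, labeled_bytes),
    }


if __name__ == "__main__":
    sample = [
        Flow("app_a", "app_a", 100),
        Flow("app_b", "app_a", 300),   # labeled, but wrong
        Flow("app_c", None, 600),      # left unlabeled
        Flow("app_d", "app_d", 1000),
    ]
    for name, value in evaluate(sample).items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions, a labeler can trade completeness for accuracy by declining to label flows it is unsure about, which is why the abstract reports the two figures together.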