Noise-Resistant Statistical Traffic Classification.

Binfeng Wang, Jun Zhang, Zili Zhang, Lei Pan, Yang Xiang, Dawen Xia
DOI: https://doi.org/10.1109/tbdata.2017.2735996
2017-01-01
IEEE Transactions on Big Data
Abstract: Network traffic classification plays a significant role in cyber security applications and network management. Conventional statistical classification techniques assume that clean, labelled samples are available for building classification models. However, in the big data era, mislabelled training data are common because new applications keep appearing and knowledge about them is lacking. Existing statistical traffic classification techniques do not address mislabelled training data, so their performance degrades when such data are present. To meet this challenge, we propose a new scheme, Noise-resistant Statistical Traffic Classification (NSTC), which incorporates noise elimination and reliability estimation into traffic classification. After eliminating noisy samples, NSTC estimates the reliability of the remaining training data and then builds a robust traffic classifier. Traffic classification experiments on two real-world traffic data sets show that NSTC can effectively address the problem of mislabelled training data. Compared with state-of-the-art methods, NSTC significantly improves classification performance on big unclean data.