Bits Learning: User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification

Zhenlong Yuan, Jie Xu, Yibo Xue, Mihaela van der Schaar
DOI: https://doi.org/10.1109/lcomm.2016.2521837
IF: 3.5529
2016-01-01
IEEE Communications Letters
Abstract: During the past decade, a great number of machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretizations of the first five packet sizes (PS) and flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures), which takes the first n bytes of a flow (ByteFlow) as features. The results show that BitFlow not only achieves higher classification accuracy but is also 1-3 orders of magnitude faster than ACAS in training and classification. More importantly, this letter also proposes to treat the first n bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between protecting user privacy and maximizing classification accuracy. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP.
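As a rough illustration of the two feature definitions in the abstract, the Python sketch below builds a BitFlow-style vector (the first n bits of a flow) and a BitPack-style vector (the first n bits of each of the first few packet payloads). It is not the authors' implementation. The function names, the MSB-first bit order, the zero padding for short inputs, and the treatment of a flow as the concatenation of its packet payloads are all assumptions made for this example.

```python
from typing import List, Sequence


def first_n_bits(data: bytes, n: int) -> List[int]:
    """Return the first n bits of data, MSB first (assumed order).

    Inputs shorter than n bits are zero-padded (an assumption).
    """
    bits = []
    for i in range(n):
        byte_index, bit_index = divmod(i, 8)
        if byte_index < len(data):
            bits.append((data[byte_index] >> (7 - bit_index)) & 1)
        else:
            bits.append(0)
    return bits


def bitflow_features(payloads: Sequence[bytes], n: int) -> List[int]:
    """BitFlow-style features: first n bits of the flow.

    The flow is modeled here as the concatenated payloads (an assumption).
    """
    return first_n_bits(b"".join(payloads), n)


def bitpack_features(payloads: Sequence[bytes], n: int, num_packets: int) -> List[int]:
    """BitPack-style features: first n bits of each of the first num_packets payloads."""
    features: List[int] = []
    for k in range(num_packets):
        payload = payloads[k] if k < len(payloads) else b""
        features.extend(first_n_bits(payload, n))
    return features


if __name__ == "__main__":
    flow = [b"\xf0", b"\x0f"]  # two toy packet payloads
    print(bitflow_features(flow, 12))
    print(bitpack_features(flow, n=4, num_packets=3))
```

In this sketch, a smaller n in `bitpack_features` exposes fewer payload bits to the classifier, which is one way to picture the user-adjustable privacy versus accuracy tradeoff the abstract describes.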