Traffic Labeller: Collecting Internet traffic samples with accurate application information

Peng Lizhi, Zhang Hongli, Yang Bo, Chen Yuehui, Wu Tong
DOI: https://doi.org/10.1109/CC.2014.6821309
2014-01-01
Abstract: Traffic classification research has long suffered from the difficulty of collecting accurate samples with ground truth. A model named Traffic Labeller (TL) is proposed to solve this problem. The TL system captures all user socket calls and their corresponding application process information in user mode on a Windows host. Once a data-sending call has been captured, its 5-tuple {source IP, destination IP, source port, destination port, transport layer protocol}, together with its application information, is sent to an intermediate NDIS driver in kernel mode. The intermediate driver then writes the application type information into the TOS field of the IP packets that match the 5-tuple. In this way, each IP packet sent from the Windows host carries its application information. Traffic samples collected on the network are therefore labelled with accurate application information and can be used to train effective traffic classification models.
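The labelling step described in the abstract can be illustrated with a minimal user-space sketch. It is not the authors' NDIS driver: it only shows, on raw IPv4 bytes, how a packet whose 5-tuple appears in a lookup table could have its TOS byte overwritten with an application label and its header checksum recomputed. The names `APP_LABELS`, `five_tuple_of`, and `label_packet`, and the label values themselves, are hypothetical and chosen for illustration; the sketch also assumes IPv4 packets carrying TCP or UDP.

```python
import socket
import struct
from typing import Dict, Tuple

# (source IP, destination IP, source port, destination port, protocol number)
FiveTuple = Tuple[str, str, int, int, int]

# Hypothetical one-byte labels; the paper's actual encoding is not given here.
APP_LABELS: Dict[str, int] = {"browser": 1, "mail": 2, "p2p": 3}


def ipv4_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit header words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def five_tuple_of(packet: bytes) -> FiveTuple:
    """Extract the 5-tuple from an IPv4 packet carrying TCP or UDP."""
    ihl = (packet[0] & 0x0F) * 4  # header length in bytes
    proto = packet[9]
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    sport, dport = struct.unpack("!HH", packet[ihl:ihl + 4])
    return (src, dst, sport, dport, proto)


def label_packet(packet: bytes, table: Dict[FiveTuple, int]) -> bytes:
    """Return the packet with TOS set to its label if the 5-tuple is known."""
    label = table.get(five_tuple_of(packet))
    if label is None:
        return packet  # unknown flow: leave the packet untouched
    ihl = (packet[0] & 0x0F) * 4
    header = bytearray(packet[:ihl])
    header[1] = label & 0xFF      # byte 1 of the IPv4 header is the TOS field
    header[10:12] = b"\x00\x00"   # zero the checksum before recomputing it
    struct.pack_into("!H", header, 10, ipv4_checksum(bytes(header)))
    return bytes(header) + packet[ihl:]
```

In this sketch the table plays the role of the 5-tuple and application information passed from user mode to the driver; a collector on the network would then read byte 1 of each captured IPv4 header to recover the label.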