Passive browser identification with multi-scale Convolutional Neural Networks

Saeid Samizade, Chao Shen, Chengxiang Si, Xiaohong Guan
DOI: https://doi.org/10.1016/j.neucom.2019.10.028
IF: 6
2020-01-01
Neurocomputing
Abstract: Browser identification is the act of recognizing web traffic through surveillance despite the use of encryption or anonymizing software. Although previous work has reported some promising results, browser fingerprinting is still an emerging technique and has not reached an acceptable level of performance. This paper presents a novel approach that uses a deep convolutional neural network (deep CNN) learning model to extract the complete shape of the traffic I/O graph signal, yielding stable traffic characteristics, and employs nonlinear multi-class classification algorithms to perform browser identification. The approach is evaluated on a new dataset collected across a large number of websites. Extensive experimental results show that the traffic characteristics learned from the I/O graph by the deep CNN are much more stable and discriminative than the metrics obtained in earlier studies, and that the approach achieves a practically useful level of performance with significant precision and recall. Additional experiments on the depth of the deep CNN are provided to further examine the applicability of the approach. The dataset is publicly available to facilitate future research. © 2019 Elsevier B.V. All rights reserved.
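The abstract does not say how the traffic I/O graph signal is constructed. As a minimal sketch, assume it is the number of bytes observed per fixed-width time bin, similar to a packet analyzer's I/O graph. The packet format `(timestamp_seconds, size_bytes)`, the function names `io_graph` and `multi_scale_io_graphs`, and the use of several bin widths to obtain several scales are all assumptions made for illustration, not the authors' preprocessing or their definition of "multi-scale".

```python
def io_graph(packets, bin_width):
    """Sum packet sizes into fixed-width time bins (hypothetical preprocessing).

    packets: iterable of (timestamp_seconds, size_bytes) pairs.
    Returns a list of byte counts, one per bin, starting at the first packet.
    """
    packets = sorted(packets)
    if not packets:
        return []
    start = packets[0][0]
    # Number of bins needed to cover the span from first to last packet.
    n_bins = int((packets[-1][0] - start) // bin_width) + 1
    signal = [0] * n_bins
    for ts, size in packets:
        signal[int((ts - start) // bin_width)] += size
    return signal


def multi_scale_io_graphs(packets, bin_widths=(1.0, 5.0)):
    """Build one I/O graph signal per bin width (an assumed notion of scale)."""
    return {width: io_graph(packets, width) for width in bin_widths}


if __name__ == "__main__":
    # Toy trace: four packets over three seconds.
    trace = [(0.0, 100), (0.5, 200), (1.2, 50), (3.0, 400)]
    for width, signal in multi_scale_io_graphs(trace, (1.0, 2.0)).items():
        print(width, signal)
```

Signals like these would then serve as the one-dimensional input to a classifier. The network architecture itself is not described in the abstract, so it is left out of the sketch.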