A Novel Framework for Malicious Encrypted Traffic Classification at Host Level and Flow Level.

Haitao Zhang, Lei Luo, Yun Li, Lirong Chen, Xiaobo Wu
DOI: https://doi.org/10.1109/dsc55868.2022.00076
2022-01-01
Abstract: More and more data is transferred in encrypted form, especially over HTTPS. As a result, network attackers often use encryption to transmit control commands and avoid detection. Malware or unidentified users may use unauthorized encrypted proxies to communicate and to access malicious websites. Confirming the details of these behaviors facilitates the analysis of suspicious events, but intercepted encrypted traffic cannot be parsed. Since the payload of encrypted traffic is not observable, machine learning combined with domain knowledge is the mainstream method for detecting malicious encrypted traffic. Given the complexity of decrypting traffic, we describe traffic at both the host level and the flow level using packet length histograms, sequence features, and statistical features. In particular, we use the Word2Vec algorithm to represent packet length sequences. In our experiments, we collect the encrypted traffic generated by malware communicating with multiple websites through an encrypted proxy. The experimental results show the effectiveness of our framework, which achieves 96.2% accuracy.
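The flow-level features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bin width, the maximum packet length, the use of a signed length to encode direction, and all function names are assumptions made for illustration. The last function only turns a packet length sequence into "words"; such token lists are the kind of input a Word2Vec implementation would be trained on.

```python
from statistics import mean, pstdev

BIN_WIDTH = 100  # assumed histogram bin width in bytes
MAX_LEN = 1500   # assumed cap on packet length (typical Ethernet MTU)


def length_histogram(lengths, bin_width=BIN_WIDTH, max_len=MAX_LEN):
    """Normalized histogram of absolute packet lengths for one flow."""
    n_bins = max_len // bin_width + 1
    hist = [0] * n_bins
    for n in lengths:
        # Lengths above max_len are clamped into the last bin.
        idx = min(abs(n), max_len) // bin_width
        hist[idx] += 1
    total = len(lengths) or 1  # avoid division by zero for empty flows
    return [count / total for count in hist]


def statistical_features(lengths):
    """Simple per-flow statistics; expects at least one packet."""
    sizes = [abs(n) for n in lengths]
    return {
        "count": len(sizes),
        "mean": mean(sizes),
        "std": pstdev(sizes),
        "min": min(sizes),
        "max": max(sizes),
    }


def length_tokens(lengths, bin_width=BIN_WIDTH):
    """Map each packet to a token such as 'out_5' or 'in_14'.

    Assumption: a non-negative length is an outbound packet and a
    negative length is an inbound packet.
    """
    return [
        ("out" if n >= 0 else "in") + "_" + str(abs(n) // bin_width)
        for n in lengths
    ]


# Hypothetical flow: signed packet lengths in bytes.
flow = [517, -1460, -1460, 126, -51, 300]
histogram = length_histogram(flow)
stats = statistical_features(flow)
tokens = length_tokens(flow)
```

Binning the lengths before tokenizing keeps the vocabulary small, which is one plausible way to make an embedding model practical on packet length sequences; the abstract does not say how the authors build their vocabulary.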