Positional-Unigram Byte Models for Generalized TLS Fingerprinting

Hector A. Valdez,Sean McPherson
DOI: https://doi.org/10.48550/arxiv.2405.07848
2024-05-13
Cryptography and Security
Abstract:We use positional-unigram byte models along with maximum likelihood for generalized TLS fingerprinting and empirically show that it is robust to cipher stunting. Our approach creates a set of positional-unigram byte models from client hello messages. Each positional-unigram byte model is a statistical model of TLS client hello traffic created by a client application or process. To fingerprint a TLS connection, we use its client hello, and compute the likelihood as a function of a statistical model. The statistical model that maximizes the likelihood function is the predicted client application for the given client hello. Our data driven approach does not use side-channel information and can be updated on-the-fly. We experimentally validate our method on an internal dataset and show that it is robust to cipher stunting by tracking an unbiased $f_{1}$ score as we synthetically increase randomization.
What problem does this paper attempt to address?