Optimizing Precision for Open-World Website Fingerprinting

Tao Wang
DOI: https://doi.org/10.48550/arXiv.1802.05409
2018-02-15
Abstract: Traffic analysis attacks that identify which web page a client is browsing using only her packet metadata, known as website fingerprinting, have been proven effective in closed-world experiments against privacy technologies like Tor. However, due to the base rate fallacy, these attacks have failed in large open-world settings against clients that visit sensitive pages with a low base rate. We find that this is because they have poor precision, as they were designed to maximize recall. In this work, we argue that precision is more important than recall for open-world website fingerprinting. For this reason, we develop three classes of precision optimizers, based on confidence, distance, and ensemble learning, that can be applied to any classifier to increase precision. We test them on known website fingerprinting attacks and show significant improvements in precision. Against a difficult scenario, where the attacker wants to monitor and distinguish 100 sensitive pages each with a low mean base rate of 0.00001, our best optimized classifier can achieve a precision of 0.78; the highest precision of any known attack before optimization was 0.014. We use precise classifiers to tackle realistic objectives in website fingerprinting, including selection, identification, and defeating website fingerprinting defenses.
Cryptography and Security
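
The base rate argument in the abstract can be illustrated with a short calculation. The sketch below is not the paper's method: it collapses the 100 monitored pages into a single "sensitive" class (so the combined base rate is assumed to be 100 × 0.00001 = 0.001), and the true and false positive rates are hypothetical values chosen only to show the effect. The function name `precision` and its parameters are likewise illustrative.

```python
def precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision of a binary classifier under a given base rate.

    tpr:       true positive rate (recall) on sensitive visits
    fpr:       false positive rate on non-sensitive visits
    base_rate: fraction of all visits that are sensitive
    """
    true_pos = tpr * base_rate            # sensitive visits correctly flagged
    false_pos = fpr * (1.0 - base_rate)   # non-sensitive visits wrongly flagged
    if true_pos + false_pos == 0.0:
        raise ValueError("classifier never outputs a positive")
    return true_pos / (true_pos + false_pos)


# Assumption: 100 pages at a mean base rate of 0.00001 each, pooled into one class.
BASE_RATE = 100 * 0.00001

# Hypothetical high-recall classifier: most flagged visits are false positives.
high_recall = precision(tpr=0.9, fpr=0.01, base_rate=BASE_RATE)

# Hypothetical classifier that gives up recall for a much lower false positive rate.
high_precision = precision(tpr=0.5, fpr=0.0001, base_rate=BASE_RATE)

print(f"high-recall classifier precision:    {high_recall:.3f}")
print(f"high-precision classifier precision: {high_precision:.3f}")
```

Under these assumed rates, the classifier with lower recall but far fewer false positives ends up with much higher precision, which is the trade-off the abstract argues for in the open-world setting.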