Web Tracking Technologies and Protection Mechanisms
Nataliia Bielova
DOI: https://doi.org/10.1145/3133956.3136067
2017-10-30
Abstract: Billions of users browse the Web every day, leaving digital traces on millions of websites. Every visit, mouse move or button click may trigger a wide variety of hidden data exchanges across multiple tracking companies. As a result, these companies collect a vast amount of users' data, preferences and habits, which are extremely useful to online advertisers and profitable for data brokers, but very worrisome for users' privacy. In this 3-hour tutorial we will cover the wide variety of Web tracking technologies, ranging from simple cookies to advanced cross-device fingerprinting. We will describe the main mechanisms behind Web tracking and what users can do to protect themselves. Moreover, we will discuss solutions Web developers can use to automatically eliminate tracking from the third-party content they include in their applications. This tutorial will be of interest to a general audience of computer scientists, and we do not require any specific prerequisite knowledge from attendees.

We will cover the following tracking mechanisms:
- third-party cookie tracking and other stateful tracking techniques that enable tracking across multiple websites,
- cookie respawning, which is used to re-create deleted user cookies,
- cookie synching, which allows trackers and ad agencies to synchronise user IDs across different companies,
- browser fingerprinting, including Canvas, WebRTC and AudioContext fingerprinting,
- cross-browser device fingerprinting, allowing trackers to recognise users across several devices.

We will then demonstrate the prevalence of such techniques on the Web, based on previous research. We will present the advertisement ecosystem and explain how Web technologies are used in advertising, in particular in Real-Time Bidding (RTB).
We will explain how cookie synching is used in RTB and present recent analysis of how much a user's tracking data is worth. We will discuss the mechanisms website owners use to automatically interact with ad agencies, and explain their consequences for users' security and privacy.

To help users protect themselves from Web tracking, we will give an overview of existing solutions. We will start with browser settings, and show that basic third-party cookie tracking is still possible even in the private browsing mode of the most common Web browsers. We will then present privacy-protecting browser extensions and compare how effective they are at protecting against Web tracking. Next, we will present possible protection mechanisms based on browser randomisation, which defend against advanced fingerprinting techniques. Finally, we will present solutions for Web developers who want to include third-party content in their websites but would like to automatically remove any tracking of their users. In particular, we will discuss simple solutions that exist today for social plugin integration, and propose more advanced server-side solutions that are a result of our own research.
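
As an illustration of the cookie synching mentioned above, the following is a minimal Python sketch, not taken from the tutorial itself. It simulates a browser cookie jar and two trackers in memory; the `Browser` and `Tracker` classes, the `/sync` endpoint, the `partner` and `uid` query parameters and the `.example` domains are all hypothetical names chosen for this example. The idea shown is that one tracker redirects the browser to a partner with its own user ID in the URL, so the partner can link that ID to the ID stored in its own cookie.

```python
import uuid
from urllib.parse import urlencode, urlparse, parse_qs


class Browser:
    """Hypothetical stand-in for a browser: one cookie (user ID) per domain."""

    def __init__(self):
        self.cookie_jar = {}  # domain -> user ID


class Tracker:
    """Hypothetical tracker that keeps a table of partner IDs."""

    def __init__(self, domain):
        self.domain = domain
        self.match_table = {}  # (partner domain, partner user ID) -> own user ID

    def identify(self, browser):
        # Read this tracker's cookie, setting a fresh random ID on first contact.
        return browser.cookie_jar.setdefault(self.domain, uuid.uuid4().hex)

    def sync_url(self, browser, partner):
        # Build the redirect URL that passes our user ID to the partner.
        query = urlencode({"partner": self.domain, "uid": self.identify(browser)})
        return f"https://{partner.domain}/sync?{query}"

    def handle_sync(self, browser, url):
        # Partner side: link the ID carried in the URL to our own cookie ID.
        params = parse_qs(urlparse(url).query)
        key = (params["partner"][0], params["uid"][0])
        self.match_table[key] = self.identify(browser)


browser = Browser()
tracker_a = Tracker("tracker-a.example")
tracker_b = Tracker("tracker-b.example")

# Tracker A redirects the browser to tracker B with A's user ID in the URL.
redirect = tracker_a.sync_url(browser, tracker_b)
tracker_b.handle_sync(browser, redirect)

id_a = browser.cookie_jar["tracker-a.example"]
id_b = browser.cookie_jar["tracker-b.example"]
print(tracker_b.match_table[("tracker-a.example", id_a)] == id_b)
```

After the redirect, tracker B holds a mapping from tracker A's ID to its own, even though neither tracker can read the other's cookie directly. Real deployments differ in many details; this sketch only captures the ID exchange through the URL.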