Abstract: Browser fingerprinting is the identification of a browser through the network traffic captured during communication between the browser and the server. This can be done using the HTTP protocol, browser extensions, and other methods. This paper discusses browser fingerprinting over HTTPS with TLS 1.3. The study observed that different browsers use a different number of messages to communicate with the server, and that the lengths of those messages also vary. To conduct the study, a network was set up using a UTM hypervisor, with one virtual machine acting as the server and another acting as the client running the different browsers. The communication was captured, and a 30%-35% dissimilarity was found in the behavior of different browsers.

What problem does this paper attempt to address?

This paper addresses the problem of fingerprinting browsers over encrypted (HTTPS) communication. Specifically, the author studies how different browsers behave in network communication under TLS 1.3 and proposes a method based on message lengths and the cipher suite list to distinguish them.

### Research Background and Problems

1. **Differences between HTTP and HTTPS**:
   - In HTTP communication, data is transmitted in plain text, so the server can easily identify the client browser through the `User-Agent` field.
   - In HTTPS communication, the data is encrypted and traditional fingerprinting methods no longer apply, so new methods are needed to identify browsers.
2. **Importance of Browser Fingerprinting**:
   - Browser fingerprinting not only helps the server identify the client device, but can also help detect malicious users, because their fingerprints usually differ from those of legitimate users.
3. **Limitations of Existing Research**:
   - Most existing research relies on decrypting HTTPS fields or on complex combination sequence tests; these methods are computationally costly and inefficient.

### Main Contributions of the Paper

- **Proposed a New Method**: By analyzing the lengths of TLS handshake and data messages and combining them with the cipher suite list, the author proposes a browser fingerprinting method that does not require decrypting HTTPS fields.
- **Experimental Verification**: The author set up a virtual network environment, captured the communication between different browsers and the server, and used interpolation and cosine similarity to quantify the similarities and differences between browsers.
- **Result Presentation**: The experimental results show significant differences in the communication behavior of different browsers on the same page, so different browsers can be effectively distinguished.
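The summary does not say how interpolation is applied. One plausible reading, offered here only as an assumption, is that browsers exchange different numbers of messages, so their message-length sequences must be brought to a common length before they can be compared. A minimal Python sketch of that idea using linear interpolation follows; the function name `resample` and the sample lengths are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def resample(lengths: Sequence[float], target: int) -> List[float]:
    """Linearly interpolate a sequence of message lengths to `target` points.

    Assumption: this is one way to align sequences of unequal length;
    the paper's exact interpolation procedure is not given in the summary.
    """
    if target < 1 or len(lengths) == 0:
        raise ValueError("need a non-empty sequence and target >= 1")
    if len(lengths) == 1 or target == 1:
        return [float(lengths[0])] * target

    # Spacing between output samples, measured in input index units.
    step = (len(lengths) - 1) / (target - 1)
    out: List[float] = []
    for i in range(target):
        pos = i * step
        lo = int(pos)                       # index of the left neighbour
        hi = min(lo + 1, len(lengths) - 1)  # index of the right neighbour
        frac = pos - lo                     # distance from the left neighbour
        out.append(lengths[lo] * (1 - frac) + lengths[hi] * frac)
    return out


# Hypothetical message-length sequences (bytes) for two browsers.
browser_a = [517, 1400, 230, 74, 310]
browser_b = [583, 1250, 64]

# Stretch the shorter sequence so both have the same number of points.
aligned_b = resample(browser_b, len(browser_a))
print(len(aligned_b))
```

After this step both sequences have equal length and can be treated as vectors for a similarity measure.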
### Formula Representation

The paper uses cosine similarity to measure the similarity between browsers:

\[ \text{cosine similarity}(\vec{A}, \vec{B}) = \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\| \|\vec{B}\|} \]

Cosine dissimilarity is defined as:

\[ \text{cosine dissimilarity}(\vec{A}, \vec{B}) = 1 - \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\| \|\vec{B}\|} \]

Here, $\vec{A}$ and $\vec{B}$ respectively represent the communication message length vectors of two browsers on a certain webpage.

### Conclusion

This research proves that by analyzing the message length and cipher suite list under the TLS 1.3 protocol, different browsers can be effectively fingerprinted. Future work will further expand to more browsers and consider combining other protocols (such as the TCP protocol) to improve the identification accuracy.
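The cosine similarity and dissimilarity formulas given under Formula Representation can be computed directly. The following Python sketch implements them with the standard library only; the two message-length vectors are hypothetical values chosen for illustration, not measurements from the paper, and both vectors are assumed to already have the same length.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return (A . B) / (||A|| * ||B||) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_dissimilarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine_similarity(a, b)."""
    return 1.0 - cosine_similarity(a, b)


# Hypothetical message-length vectors (bytes) for two browsers
# loading the same page.
browser_a = [517, 1400, 230, 74]
browser_b = [583, 1250, 310, 64]

print(f"dissimilarity: {cosine_dissimilarity(browser_a, browser_b):.4f}")
```

Because message lengths are non-negative, the similarity lies between 0 and 1, so the dissimilarity can be read as a fraction and compared with the percentage figures reported in the abstract.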

Fingerprinting Browsers in Encrypted Communications
