Your Trajectory Privacy Can Be Breached Even if You Walk in Groups

Kaixin Sui, Youjian Zhao, Dapeng Liu, Minghua Ma, Lei Xu, Li Zimu, Dan Pei
DOI: https://doi.org/10.1109/iwqos.2016.7590444
2016-01-01
Abstract: Enterprise Wi-Fi networks enable the large-scale collection of users' mobility information at an indoor level. The collected trajectory data is very valuable for both research and commercial purposes, but its use also raises serious privacy concerns. A large body of work tries to achieve k-anonymity (hiding each user in an anonymity set no smaller than k) as the first step toward solving the privacy problem. Yet it has been qualitatively recognized that k-anonymity is still risky when the diversity of the sensitive information in the k-anonymity set is low. However, a study that provides a quantitative understanding of that risk in trajectory datasets is still lacking. In this work, we present a large-scale, measurement-based analysis of the low-diversity risk over four weeks of trajectory data collected from Tsinghua, a campus that covers an area of 4 km², on which 2,670 access points are deployed in 111 buildings. Using this dataset, we highlight the high risk of low diversity. For example, we find that even when 5-anonymity is satisfied, the sensitive attributes of 25% of individuals can be easily guessed. We also find that although a larger k increases the size of anonymity sets, the corresponding improvement in the diversity of anonymity sets is very limited (it decays exponentially). These results suggest that diversity-oriented solutions are necessary.
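The low-diversity risk described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' measurement pipeline: the function name `anonymity_report`, the record layout (a generalized trajectory label paired with a sensitive attribute), the toy data, and the choice of "share of the most common sensitive value" as a diversity indicator are all assumptions made for illustration.

```python
from collections import Counter

def anonymity_report(records):
    """Group (quasi_id, sensitive) pairs by quasi_id and summarize each group.

    Hypothetical helper: 'quasi_id' stands in for a generalized trajectory,
    'sensitive' for an attribute an attacker wants to learn.
    """
    groups = {}
    for quasi_id, sensitive in records:
        groups.setdefault(quasi_id, []).append(sensitive)

    report = {}
    for quasi_id, values in groups.items():
        counts = Counter(values)
        report[quasi_id] = {
            "k": len(values),                 # size of the anonymity set
            "distinct": len(counts),          # number of distinct sensitive values
            "top_share": max(counts.values()) / len(values),  # attacker's best guess rate
        }
    return report

# Toy data (invented): two anonymity sets, both satisfying 5-anonymity.
records = (
    [("traj_A", "dorm_3")] * 5
    + [("traj_B", s) for s in ("dorm_1", "lab_2", "dorm_4", "library", "gym")]
)
report = anonymity_report(records)
for quasi_id, stats in sorted(report.items()):
    print(quasi_id, stats)
```

Both toy groups have k = 5, yet in `traj_A` every member shares the same sensitive value, so an attacker who only knows the group can guess it with certainty. In `traj_B` the same guess succeeds for only one member in five.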