Abstract: Human mobility trajectories are increasingly collected by ISPs to assist academic research and commercial applications. Meanwhile, there is a growing concern that individual trajectories can be de-anonymized when the data is shared, using information from external sources (e.g., online social networks). To understand this risk, prior works either estimate the theoretical privacy bound or simulate de-anonymization attacks on synthetically created datasets. However, it is not clear how well the theoretical estimates hold in practice. In this article, we collected a large-scale ground-truth trajectory dataset from 2,161,500 users of a cellular network, and two matched external trajectory datasets from a large social network (56,683 users) and a check-in/review service (45,790 users) on the same user population. The two sets of large ground-truth data provide a rare opportunity to extensively evaluate a variety of de-anonymization algorithms (nine in total). We find that their performance on the real-world dataset is far from the theoretical bound. Further analysis shows that most algorithms underestimate the impact of spatio-temporal mismatches between the data from different sources, and that the high sparsity of user-generated data also contributes to the underperformance. Based on these insights, we propose four new algorithms that are specially designed to tolerate spatial or temporal mismatches (or both) and to model location contexts and time contexts. Extensive evaluations show that our algorithms achieve more than 17 percent performance gain over the best existing algorithms, confirming our insights. Further, we propose two new location-privacy preserving mechanisms that utilize the spatio-temporal mismatches to better protect users' privacy against de-anonymization attacks. Evaluation results show that our proposed mechanisms can reduce the performance of de-anonymization attacks by over 8.0 percent, demonstrating the effectiveness of our insights.

No More Than What I Post: Preventing Linkage Attacks on Check-in Services

A Clustering-Based Location Privacy Protection Scheme for Pervasive Computing

PLAM: A Privacy-Preserving Framework for Local-Area Mobile Social Networks

X-Region: A Framework for Location Privacy Preservation in Mobile Peer-to-peer Networks

Privcheck: Privacy-Preserving Check-In Data Publishing For Personalized Location Based Services

Feel Free to Check-in: Privacy Alert against Hidden Location Inference Attacks in GeoSNs

Preventing Location-based Inference Attack in Location Based Services

Asynchronous Side Information Attack from the Edge: an Approach to Identify Participants from Anonymous Mobility Traces

Where have you been? A Study of Privacy Risk for Point-of-Interest Recommendation

Discovering People's Life Patterns from Anonymized WiFi Scanlists

Understanding Motivations Behind Inaccurate Check-ins

Walking without Friends: Publishing Anonymized Trajectory Dataset without Leaking Social Relationships

Anonymization and De-Anonymization of Mobility Trajectories: Dissecting the Gaps Between Theory and Practice

Check in or Not? A Stochastic Game for Privacy Preserving in Point-of-Interest Recommendation System

Defending Malicious Check-In Using Big Data Analysis of Indoor Positioning System: An Access Point Selection Approach

Anonymity and Historical-Anonymity in Location-Based Services

A Location Privacy Preserving Algorithm Based on Linkage Protection

Using dynamic pseudo-IDs to protect privacy in location-based services

You Can Hide, But Your Periodic Schedule Can't

Ad-hoc Anonymity: Privacy Preservation for Location-based Services in Mobile Networks

Data De-anonymization: From Mobility Traces to On-line Social Networks