Connecting the Dots: User Privacy is Not Preserved in ID-Removed Cellular Data

Fengli Xu, Zhen Tu, Yong Li
DOI: https://doi.org/10.1109/tnsm.2019.2926488
2020-01-01
IEEE Transactions on Network and Service Management
Abstract: Large-scale cellular network access records are generated by mobile users on a daily basis, leaving fine-grained footprints that have the potential to compromise the privacy of user mobility. Abundant previous research has demonstrated that simple anonymization has limited effect in preserving users’ privacy because of the prevalence of re-identification attacks. As a result, mobile operators and application vendors usually turn to a more aggressive solution: removing the identifier (ID) of each entry in the records. Owners of cellular data sets believe that this procedure is sufficient for preserving users’ privacy, since attackers cannot directly put together the cellular records that belong to one individual, let alone recover the user’s identity. However, in this paper, we argue and prove that simply removing the IDs is not sufficient for preserving mobile users’ privacy. We develop a mechanism that is able to extract the mobility patterns of users from cellular records and associate ID-removed records belonging to the same individuals. Extensive experiments show that, on average, 70% to 80% of each user’s records can be accurately recovered for two data sets, collected from the mobile application side and the mobile operator side, at scales of several thousand to tens of thousands of users. We find that the number of users released, the temporal and spatial granularity, and the speed of user movement are key factors that determine the privacy leakage in ID-removed cellular data sets.
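To make the linkage risk concrete, the following is a minimal toy sketch and not the mechanism developed in the paper, whose details the abstract does not give. It assumes each time slot holds one ID-free `(x, y)` location per user, and it links records across consecutive slots by greedy nearest-neighbor matching. The function name `link_records` and the sample coordinates are hypothetical, chosen only for illustration.

```python
from math import hypot

def link_records(slots):
    """Link ID-free points across time slots into trajectories.

    slots: list of time slots, each a list of (x, y) points with no user ID.
    Returns one trajectory (list of points) per point in the first slot.
    Toy assumption: users move little between slots compared with the
    distance separating different users.
    """
    tracks = [[p] for p in slots[0]]
    for points in slots[1:]:
        # Every (distance, track index, point index) pair, closest first.
        pairs = sorted(
            (hypot(t[-1][0] - p[0], t[-1][1] - p[1]), ti, pi)
            for ti, t in enumerate(tracks)
            for pi, p in enumerate(points)
        )
        used_tracks, used_points = set(), set()
        for _, ti, pi in pairs:
            # Greedy matching: each track and each point is used at most once.
            if ti in used_tracks or pi in used_points:
                continue
            tracks[ti].append(points[pi])
            used_tracks.add(ti)
            used_points.add(pi)
    return tracks

# Hypothetical data: two users, three slots, record order shuffled per slot.
slots = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(10.5, 9.5), (0.5, 0.2)],
    [(1.0, 0.5), (11.0, 9.0)],
]
trajectories = link_records(slots)
```

In this toy setting the two trajectories are recovered because consecutive points of one user are much closer together than points of different users. This is in line with the factors the abstract names: more users, coarser temporal or spatial granularity, or faster movement would all make such nearest-neighbor links more ambiguous.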