TransRisk: Mobility Privacy Risk Prediction based on Transferred Knowledge

Xiaoyang Xie, Zhiqing Hong, Zhou Qin, Zhihan Fang, Yuan Tian, Desheng Zhang
DOI: https://doi.org/10.1145/3534581
2022-01-01
Abstract: Human mobility data may lead to privacy concerns because a resident can be re-identified from these data by malicious attacks even with anonymized user IDs. For an urban service collecting mobility data, an efficient privacy risk assessment is essential for the privacy protection of its users. The existing methods enable efficient privacy risk assessments for service operators to fast adjust the quality of sensing data to lower privacy risk by using prediction models. However, for these prediction models, most of them require massive training data, which has to be collected and stored first. Such a large-scale long-term training data collection contradicts the purpose of privacy risk prediction for new urban services, which is to ensure that the quality of high-risk human mobility data is adjusted to low privacy risk within a short time. To solve this problem, we present a privacy risk prediction model based on transfer learning, i.e., TransRisk, to predict the privacy risk for a new target urban service through (1) small-scale short-term data of its own, and (2) the knowledge learned from data from other existing urban services. We envision the application of TransRisk on the traffic camera surveillance system and evaluate it with real-world mobility datasets already collected in a Chinese city, Shenzhen, including four source datasets, i.e., (i) one call detail record dataset (CDR) with 1.2 million users; (ii) one cellphone connection data dataset (CONN) with 1.2 million users; (iii) a vehicular GPS dataset (Vehicles) with 10 thousand vehicles; (iv) an electronic toll collection transaction dataset (ETC) with 156 thousand users, and a target dataset, i.e., a camera dataset (Camera) with 248 cameras. The results show that our model outperforms the state-of-the-art methods in terms of RMSE and MAE. Our work also provides valuable insights and implications on mobility data privacy risk assessment for both current and future large-scale services.