Differentially Private Spatiotemporal Trajectory Synthesis with Retained Data Utility

Yuqing Ge, Yunsheng Wang, Nana Wang
2024-08-23
Abstract: Spatiotemporal trajectories collected from GPS-enabled devices are of vital importance to many applications, such as urban planning and traffic analysis. Owing to privacy leakage concerns, many privacy-preserving trajectory publishing methods have been proposed. However, most of them fail to strike a good balance between privacy protection and data utility. In this paper, we propose DP-STTS, a differentially private spatiotemporal trajectory synthesizer with high data utility, which employs a model composed of a start spatiotemporal cube distribution and a first-order Markov process. Specifically, DP-STTS first discretizes the raw spatiotemporal trajectories into neighboring cubes, which limits the model size and improves the model's tolerance for noise. Then, a Markov process is used to pick the next location point. After noise is added to the model under differential privacy (DP), synthetic trajectories that preserve essential spatial and temporal characteristics of the real trajectories are generated from the noisy model. Experiments on one real-life dataset demonstrate that DP-STTS provides good data utility. Our code is available at https://github.com/Etherious72/DP-STTS.
Cryptography and Security
What problem does this paper attempt to address?
The main problem this paper addresses is **how to generate spatiotemporal trajectories with high data utility while protecting privacy**. Specifically, the paper focuses on protecting individual privacy and preserving the usefulness of the data when spatiotemporal trajectory data collected by GPS devices are released.

### Problem Background

As GPS devices have become widespread, more and more personal trajectory data are generated and collected. These data are valuable for applications such as traffic analysis, urban planning, navigation, and route recommendation. However, trajectory data are sensitive, and releasing them directly may disclose personal information. Methods are therefore needed that protect privacy while maintaining data utility.

### Deficiencies of Existing Methods

Traditional trajectory privacy-protection techniques (such as k-anonymization, false-location generation, sensitive-area calculation, or new-trajectory generation based on deep learning) protect privacy to some extent, but they have the following problems:

- **No rigorous proof of privacy protection**: These methods are vulnerable to background-knowledge attacks (such as linkage attacks and probability attacks).
- **Difficulty balancing privacy protection and data utility**: Existing methods often reduce the usability of the data as they strengthen privacy protection.

### The Method Proposed in the Paper

To overcome these problems, the paper proposes **DP-STTS (Differentially Private Spatiotemporal Trajectory Synthesizer)**, a trajectory synthesizer based on differential privacy that aims to provide provable privacy protection together with high data utility. DP-STTS works in four steps:

1. **Spatial and Temporal Discretization**: Discretize the original spatiotemporal trajectories into adjacent spatiotemporal cubes, which limits the model size and improves the model's tolerance to noise.
2. **Model Construction**: Model the trajectories with a start-cube distribution and a first-order Markov process. The start-cube distribution describes the starting position and time of each trajectory, and the Markov process selects the subsequent location points.
3. **Differential Privacy Noise**: Add noise to the model so that it satisfies differential privacy.
4. **Synthetic Trajectory Generation**: Generate synthetic trajectories from the noisy model. These trajectories retain the spatial and temporal characteristics of the real trajectories.

### Experimental Results

The experimental results show that DP-STTS outperforms existing methods (such as DP-MODR) on multiple evaluation metrics, in particular the temporal access-frequency distribution of the trajectories, the average relative error of location access, and the Kendall tau coefficient of frequent patterns. This indicates that DP-STTS retains more of the useful information in the trajectory data while protecting privacy.

### Summary

DP-STTS addresses the problem of generating spatiotemporal trajectories with high data utility while protecting privacy. The experimental results support the effectiveness of the method, which offers a new approach to privacy protection and utility preservation for spatiotemporal trajectory data.

### Summary of Key Formulas

1. **Transition Probability of the First-Order Markov Process**:
\[ \Pr[\sigma_{j+1}=\sigma \mid \sigma_0\cdots\sigma_j] = \Pr[\sigma_{j+1}=\sigma \mid \sigma_j] \]
where \(\sigma\) is the next location point and \(\sigma_j\) is the current location point.

2. **Definition of Differential Privacy**:
\[ \Pr[M(\mathcal{D}_1)=\Omega] \leq e^{\epsilon} \times \Pr[M(\mathcal{D}_2)=\Omega] \]
where \(M\) is the randomized mechanism, \(\mathcal{D}_1\) and \(\mathcal{D}_2\) are two adjacent datasets, \(\Omega\) is any possible output, and \(\epsilon\) is the privacy budget.

3. **Laplace Mechanism**:
\[ M(\mathcal{D})=\phi(\mathcal{D})+\mathrm{Lap}\left(\frac{\Delta\phi}{\epsilon}\right) \]
where \(\phi\) is the query function, \(\Delta\phi\) is its sensitivity (the largest change in \(\phi\) between adjacent datasets), and \(\mathrm{Lap}(b)\) denotes Laplace noise with scale \(b\).