Abstract: Local differential privacy (LDP) has recently emerged as a popular privacy standard. With the growing popularity of LDP, several recent works have applied LDP to spatial data, and grid-based decompositions have been a common building block in the collection and analysis of spatial data under DP and LDP. In this paper, we study three grid-based decomposition methods for spatial data under LDP: Uniform Grid (UG), PrivAG, and AAG. UG is a static approach that consists of equal-sized cells. To enable data-dependent decomposition, PrivAG was proposed by Yang et al. as the most recent adaptive grid method. To advance the state-of-the-art in adaptive grids, in this paper we propose the Advanced Adaptive Grid (AAG) method. For each grid cell, following the intuition that the cell's intra-cell density distribution will be affected by its neighbors, AAG performs uneven cell divisions depending on the neighboring cells' densities. We experimentally compare UG, PrivAG, and AAG using three real-world location datasets, varying privacy budgets, and query sizes. Results show that AAG provides higher utility than PrivAG, demonstrating the superiority of our proposed approach. Furthermore, UG's performance is heavily dependent on the choice of grid size. When the grid size is chosen optimally in UG, AAG still beats UG for small queries, but UG beats AAG for large (coarse-grained) queries.
What problem does this paper attempt to address?
This paper addresses how to perform grid decomposition of spatial data effectively under Local Differential Privacy (LDP). Specifically, the authors study three grid-based decomposition methods, Uniform Grid (UG), PrivAG, and Advanced Adaptive Grid (AAG), with the aim of improving the utility of spatial data analysis while protecting user privacy.
### Problem Background
With the spread of smartphones, connected cars, Location-Based Services (LBS), and social networks, large amounts of spatial data are being collected and analyzed. These data are valuable and can be used to provide better products and services. However, spatial data contains sensitive information, such as users' home addresses, workplaces, frequently visited places, and personal habits, so protecting its privacy is crucial.
Local Differential Privacy (LDP) is a popular privacy standard. Each user perturbs their data locally before uploading it to the server, so no third-party aggregator needs to be trusted. To process spatial data under LDP, continuous locations usually have to be discretized into a finite domain. For this reason, grid decomposition methods are widely used in tasks such as trajectory collection, range query answering, and synthetic data generation.
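The two steps described above, discretizing a location into a grid cell and perturbing it on the user's device, can be sketched as follows. This is a minimal illustration, not the paper's protocol: it assumes a `g x g` uniform grid and uses generalized randomized response (GRR), a standard LDP mechanism chosen here for brevity. The function names `cell_index`, `grr_perturb`, and `grr_estimate` are hypothetical.

```python
import math
import random

def cell_index(x, y, bounds, g):
    """Map a point inside bounds=(x_min, y_min, x_max, y_max) to a cell id
    in a g x g uniform grid (row-major). Points on the upper border are
    clamped into the last row/column."""
    x_min, y_min, x_max, y_max = bounds
    col = min(int((x - x_min) / (x_max - x_min) * g), g - 1)
    row = min(int((y - y_min) / (y_max - y_min) * g), g - 1)
    return row * g + col

def grr_perturb(cell, num_cells, epsilon, rng=random):
    """User side (assumed mechanism: GRR). Report the true cell with
    probability p = e^eps / (e^eps + k - 1), else a uniformly random
    different cell."""
    p = math.exp(epsilon) / (math.exp(epsilon) + num_cells - 1)
    if rng.random() < p:
        return cell
    other = rng.randrange(num_cells - 1)
    return other if other < cell else other + 1  # skip the true cell

def grr_estimate(reports, num_cells, epsilon):
    """Server side. Unbiased per-cell count estimates from perturbed reports."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + num_cells - 1)
    q = 1.0 / (e + num_cells - 1)
    counts = [0] * num_cells
    for r in reports:
        counts[r] += 1
    return [(c - n * q) / (p - q) for c in counts]
```

The server only ever sees the perturbed cell ids. Individual estimates can be negative or noisy, which is why the choice of grid granularity matters: more cells means finer resolution but more noise per cell.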
### Research Objectives
The main objective of this paper is to propose a new adaptive grid method, Advanced Adaptive Grid (AAG), that improves on existing adaptive grid methods such as PrivAG. Specifically, the authors aim to enhance the utility of spatial data analysis in the following ways:
1. **Introducing Uneven Partitioning**: Unlike PrivAG, which divides cells evenly, AAG divides each cell unevenly according to the densities of its adjacent cells, so that high-density areas are covered by more, smaller cells and low-density areas by fewer, larger cells.
2. **Optimizing Parameter Selection**: The authors found experimentally that the parameter values recommended for PrivAG yield an adaptive grid that is very similar to the initial uniform grid, so the benefits of adaptivity are largely lost. AAG therefore uses a new parameter selection strategy that increases the number of cells in high-density areas while avoiding over-partitioning.
3. **Handling Edge and Corner Cells**: Cells at the edge or corner of the grid lack some neighbors, and treating those missing neighbors as having zero density would lead to incorrect divisions. AAG instead substitutes the current cell's own density for each missing neighbor.
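The edge and corner rule in item 3 can be sketched as below. This is an illustrative sketch only: it assumes a 4-neighborhood (up, down, left, right) over a row-major density matrix, which this summary does not specify, and `neighbor_densities` is a hypothetical helper name.

```python
def neighbor_densities(density, row, col):
    """Return the densities of the four neighbors of cell (row, col).

    Assumption: a 4-neighborhood. For edge and corner cells, a missing
    neighbor is replaced by the cell's own density rather than zero.
    """
    rows, cols = len(density), len(density[0])
    own = density[row][col]
    offsets = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    result = {}
    for name, (dr, dc) in offsets.items():
        r, c = row + dr, col + dc
        inside = 0 <= r < rows and 0 <= c < cols
        result[name] = density[r][c] if inside else own
    return result
```

With this fallback, a corner cell sees no artificial density drop toward the outside of the grid, so the missing sides do not bias how it is divided.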
### Experimental Verification
The authors conducted experiments on three real-world location datasets (Gowalla, Porto, and Foursquare) and compared UG, PrivAG, and AAG under different privacy budgets (ε) and query sizes (ρ). The experimental results show that:
- **AAG is Superior to PrivAG**: Across the tested privacy budgets and query sizes, the Average Query Error (AQE) of AAG is lower than that of PrivAG, which supports the effectiveness of AAG.
- **UG is Suitable for Large Queries**: For large (coarse-grained) queries, UG outperforms AAG when its grid size is chosen optimally. UG's performance depends heavily on that choice.
- **AAG is Suitable for Small Queries**: For small queries, AAG provides higher utility than UG even when UG's grid size is chosen optimally, especially when the privacy budget is low.
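To make the evaluation concrete, the sketch below answers a rectangular range query from grid cell counts and averages the error over a set of queries. The exact AQE formula is not given in this summary, so the sketch assumes a common form from the range-query literature: relative error with a sanity bound in the denominator. It also assumes the unit square as the data domain and uniformly distributed points within each cell. The function names and the `sanity_bound` parameter are hypothetical.

```python
def answer_range_query(cell_counts, g, query):
    """Estimate the number of points inside query=(x0, y0, x1, y1).

    cell_counts[row][col] holds the (estimated) count of each cell in a
    g x g uniform grid over the unit square. Partially covered cells
    contribute in proportion to the overlapped area (uniformity assumption).
    """
    qx0, qy0, qx1, qy1 = query
    side = 1.0 / g
    total = 0.0
    for row in range(g):
        for col in range(g):
            x0, y0 = col * side, row * side
            overlap_w = max(0.0, min(qx1, x0 + side) - max(qx0, x0))
            overlap_h = max(0.0, min(qy1, y0 + side) - max(qy0, y0))
            total += cell_counts[row][col] * (overlap_w * overlap_h) / (side * side)
    return total

def average_query_error(true_answers, est_answers, sanity_bound):
    """Assumed AQE form: mean of |est - true| / max(true, sanity_bound)."""
    errors = [abs(e - t) / max(t, sanity_bound)
              for t, e in zip(true_answers, est_answers)]
    return sum(errors) / len(errors)
```

Under this model, a small query overlaps only a few cells, so how well the cell boundaries follow the data matters most. A large query sums many cells, so the accumulated noise across cells plays a larger role.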
### Conclusions
The main contributions of this paper include:
1. Proposing AAG, a new adaptive grid method that improves the utility of spatial data analysis by dividing cells unevenly according to the densities of adjacent cells.
2. Showing experimentally that AAG provides higher utility than the existing adaptive grid method PrivAG and performs particularly well on small queries.
3. Showing that UG retains an advantage on large queries when its grid size is well chosen, which gives practitioners a basis for choosing between methods in different application scenarios.
With these improvements, AAG increases the accuracy of spatial data analysis while keeping users' data protected under LDP.