PPA-DBSCAN: Privacy-preserving ρ-Approximate Density-based Clustering

Jiaxuan Fu, Ke Cheng, Zhao Chang, Yulong Shen
DOI: https://doi.org/10.1109/tdsc.2024.3375347
2024-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: Clustering is a widely used data analysis technique that partitions a set of data into multiple clusters, where objects in the same cluster have similar properties. Data for clustering analysis often comes from different data sources, which makes it important to maintain data privacy. However, existing privacy-preserving clustering schemes either require prior knowledge or are applicable only to small datasets because of their impractical costs. To solve this issue, we follow a classical approximate DBSCAN clustering algorithm and adapt it to the privacy-preserving context. Concretely, to construct our secure approximate clustering algorithm, we propose a series of basic secure computation protocols over additively secret-shared values. In addition, we design a crypto-friendly grid partitioning method, from which an efficient and privacy-preserving approximate DBSCAN scheme is derived. Theoretical analysis and experimental results show that our scheme achieves almost the same cluster quality as the plain-text exact DBSCAN. Our extensive experiments on different datasets demonstrate that our scheme is accurate and efficient.
computer science, information systems, software engineering, hardware & architecture
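The abstract builds its protocols on additively secret-shared values. As a rough illustration of that general idea only (not the paper's actual protocols), the sketch below splits an integer into random shares that sum to the secret modulo a ring size, and shows that parties can add two shared values locally without revealing them. The ring size `MODULUS`, the two-party default, and the function names `share`, `reconstruct`, and `add_shares` are assumptions made for this example.

```python
import secrets

# Assumed ring size for this sketch; the paper's parameters may differ.
MODULUS = 2 ** 32


def share(value, n_parties=2):
    """Split `value` into n_parties additive shares modulo MODULUS."""
    # All but the last share are uniformly random ring elements.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


def add_shares(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication needed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


if __name__ == "__main__":
    x_shares = share(7)
    y_shares = share(35)
    z_shares = add_shares(x_shares, y_shares)
    print(reconstruct(z_shares))
```

Addition is the easy case because it needs no interaction. Operations such as multiplication or comparison on shared values, which a distance-based clustering scheme would need, require interactive protocols that this sketch does not attempt to show.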