Differentially Private Node Similarity Matrix Release for Large-Scale Social Networks

Chao Xu, Chao Shan, Ning Wu, Yunfeng Zou
DOI: https://doi.org/10.1109/ICCCS52626.2021.9449152
2021-04-23
Abstract: Node similarity, which is usually released in the form of a matrix, provides the basis for network analysis. However, directly releasing a node similarity matrix may entail privacy risk. More concretely, the sensitive network structure implied by node similarity can threaten the privacy of individuals in the original network. For large-scale social networks, the challenge is to extract an effective node similarity matrix while preserving privacy. Privacy-preserving matrix factorization (MF) is a valid means to address this problem. Existing methods using differentially private MF, however, have two limitations. First, they require a complex analytic calculation to derive the gradient of the MF objective function, which results in poor scalability. Second, they typically use private gradient descent as the MF solver, which requires adding a large amount of noise to each gradient; this noise swamps the signal provided by the gradient and degrades overall result quality. Motivated by this, we propose GPMF, a generic differentially private matrix factorization method for node similarity matrix release. Specifically, to calculate the gradient in a uniform way, we design a provable gradient estimation method, which theoretically reveals the relationship between the actual and estimated gradients, whose error can be preset for optimization, removing restrictions on the form of the objective function. Further, to achieve optimal gradient error without compromising privacy, we devise a private Langevin Monte Carlo (LMC) method, which uses Gaussian noise that satisfies differential privacy in place of the noise term in LMC, eliminating the damage that extra noise does to the gradient while preserving the inherent properties of LMC. Theoretical analysis is provided to ensure that GPMF yields a processed node similarity matrix with high network structure utility while satisfying (ε, δ)-differential privacy. Simulations confirm the effectiveness and efficiency of GPMF.
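The two ideas in the abstract, estimating the gradient from function values alone and letting Gaussian noise play the role of the Langevin noise term, can be illustrated with a minimal sketch. This is not the GPMF algorithm from the paper: the function names, the random-direction finite-difference estimator, the symmetric factorization S ≈ U Uᵀ, the clipping step, and every parameter value below are assumptions chosen for illustration. In particular, `sigma` is a placeholder and is not calibrated to any (ε, δ) budget, so the code as written carries no privacy guarantee.

```python
import numpy as np


def estimate_gradient(loss, X, mu=1e-4, num_dirs=20, rng=None):
    """Estimate the gradient of loss at X from function values only.

    Assumption: a random-direction finite-difference estimator; the
    paper's own estimator may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = loss(X)
    grad = np.zeros_like(X)
    for _ in range(num_dirs):
        D = rng.standard_normal(X.shape)
        # Directional finite difference, projected back onto direction D.
        grad += (loss(X + mu * D) - base) / mu * D
    return grad / num_dirs


def clip(G, max_norm):
    """Scale G so that its Frobenius norm is at most max_norm."""
    norm = np.linalg.norm(G)
    return G if norm <= max_norm else G * (max_norm / norm)


def noisy_langevin_mf(S, rank=2, steps=200, lr=0.01,
                      clip_norm=1.0, sigma=0.1, seed=0):
    """Factorize a symmetric matrix S as U @ U.T with noisy updates.

    Each step adds Gaussian noise to a clipped gradient estimate, so the
    noise acts as the random term of a Langevin-style update.
    sigma is a hypothetical placeholder, NOT a calibrated privacy parameter.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    U = 0.1 * rng.standard_normal((n, rank))

    def loss(U_):
        return 0.5 * np.sum((S - U_ @ U_.T) ** 2)

    for _ in range(steps):
        g = clip(estimate_gradient(loss, U, rng=rng), clip_norm)
        noise = sigma * clip_norm * rng.standard_normal(U.shape)
        U = U - lr * (g + noise)
    return U @ U.T


if __name__ == "__main__":
    # Toy 4-node similarity matrix (made-up values for illustration).
    S = np.array([[1.0, 0.8, 0.1, 0.0],
                  [0.8, 1.0, 0.2, 0.1],
                  [0.1, 0.2, 1.0, 0.7],
                  [0.0, 0.1, 0.7, 1.0]])
    print(noisy_langevin_mf(S).round(2))
```

Clipping bounds how much any single gradient can move the factors, which is the usual precondition for calibrating Gaussian noise; a real implementation would derive the noise scale from the sensitivity and the target (ε, δ) rather than fix it by hand.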
Computer Science