Abstract: This paper studies how to approximate pufferfish privacy when the adversary's prior belief of the published data is Gaussian distributed. Using Monge's optimal transport plan, we show that $(\epsilon, \delta)$-pufferfish privacy is attained if the additive Laplace noise is calibrated to the differences in mean and variance of the Gaussian distributions conditioned on every discriminative secret pair. A typical application is the private release of the summation (or average) query, for which sufficient conditions are derived for approximating $\epsilon$-statistical indistinguishability of each individual's sensitive data. The result is then extended to arbitrary prior beliefs trained by Gaussian mixture models (GMMs): calibrating Laplace noise to a convex combination of differences in mean and variance between Gaussian components attains $(\epsilon,\delta)$-pufferfish privacy.

What problem does this paper attempt to address?

The paper asks how to attain pufferfish privacy when data is released with additive Laplace noise and the adversary's prior belief is Gaussian. Specifically, it studies how to calibrate the Laplace noise so that $(\epsilon, \delta)$-pufferfish privacy holds. The main contributions are:

1. **Gaussian-distributed data with all secret instances given**: Using Monge's optimal transport plan, the paper proves that adding Laplace noise attains $(\epsilon, \delta)$-pufferfish privacy when the scale parameter $b$ is calibrated to the differences in mean and variance for every secret pair $(s_i, s_j)$.
2. **Privatization of sum queries in multi-user systems**: Applying the result above, the paper derives sufficient conditions for privatizing sum queries in multi-user systems so that each participant's data is approximately $\epsilon$-statistically indistinguishable.
3. **GMM model for arbitrary prior distributions**: Assuming the adversary has learned an arbitrary prior distribution through a Gaussian mixture model (GMM), the paper proves that $(\epsilon, \delta)$-pufferfish privacy is attained by calibrating the scale parameter $b$ to a convex combination of the differences in mean and variance between Gaussian components.

### Formula Summary

- **Calibration of the scale parameter of Laplace noise**:
  \[ b \geq \frac{1}{\epsilon} \max_{\rho, (s_i, s_j) \in S} \left( | \mu_i - \mu_j | + \tau^*(\delta) | \sigma_i - \sigma_j | \right) \]
  where $\tau^*(\delta) = \min \{ \tau : \Pr(Z > \tau) \leq \frac{\delta}{2} \}$, or equivalently $\tau^*(\delta) = Q^{-1}(\frac{\delta}{2})$, and $Q(t)$ is the tail probability of the standard normal distribution.
- **Sum queries in multi-user systems**:
  \[ b \geq \frac{1}{\epsilon} \max_{k \in K} \left( | \mu_k | + \Delta \sigma_k \tau^*(\delta) \right) \]
  where $\Delta \sigma_k = \sqrt{\sum_{k' \in K \setminus \{k\}} \sigma_{k'}^2 + \sigma_k^2} - \sqrt{\sum_{k' \in K \setminus \{k\}} \sigma_{k'}^2}$.
- **Noise calibration under GMM prior**:
  \[ b \geq \frac{1}{\epsilon} \max_{\rho, (s_i, s_j) \in S} \sum_{m, l} w_{ml}^* \left( | \mu_{im} - \mu_{jl} | + \tau^*(\delta) | \sigma_{im} - \sigma_{jl} | \right) \]
  where the weights $w_{ml}^*$ form the convex combination over pairs of Gaussian components.

### Experimental Verification

The paper runs experiments on the Adult and Hungarian Heart Disease datasets from the UCI Machine Learning Repository to verify the proposed method. The results show that appropriately calibrated Laplace noise achieves the required privacy level while maintaining data utility.
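The first two calibration rules in the formula summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function names and the input formats (`pairs` as a list of `((mu_i, sigma_i), (mu_j, sigma_j))` tuples, `users` as a list of `(mu_k, sigma_k)` tuples) are hypothetical, and the maximization over priors $\rho$ is assumed to be folded into whatever list the caller supplies. The GMM rule is omitted because it needs the weights $w_{ml}^*$, which the summary does not specify how to compute.

```python
import math
import random
from statistics import NormalDist


def tau_star(delta: float) -> float:
    """tau*(delta) = Q^{-1}(delta / 2), Q being the standard normal tail probability."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    # Q(t) = 1 - Phi(t), hence Q^{-1}(p) = Phi^{-1}(1 - p).
    return NormalDist().inv_cdf(1.0 - delta / 2.0)


def laplace_scale_gaussian(pairs, epsilon, delta):
    """Smallest b meeting the Gaussian-prior bound.

    `pairs` (hypothetical format): iterable of
    ((mu_i, sigma_i), (mu_j, sigma_j)), one entry per secret pair and prior.
    """
    tau = tau_star(delta)
    return max(
        abs(mu_i - mu_j) + tau * abs(sigma_i - sigma_j)
        for (mu_i, sigma_i), (mu_j, sigma_j) in pairs
    ) / epsilon


def laplace_scale_sum_query(users, epsilon, delta):
    """Smallest b meeting the sum-query bound.

    `users` (hypothetical format): list of (mu_k, sigma_k), one per participant.
    """
    tau = tau_star(delta)
    total_var = sum(sigma ** 2 for _, sigma in users)
    worst = 0.0
    for mu_k, sigma_k in users:
        # Variance of the sum over all participants except k.
        rest_var = max(total_var - sigma_k ** 2, 0.0)
        delta_sigma_k = math.sqrt(total_var) - math.sqrt(rest_var)
        worst = max(worst, abs(mu_k) + delta_sigma_k * tau)
    return worst / epsilon


def add_laplace_noise(value, b, rng=random):
    """Return value + Laplace(0, b) noise.

    The difference of two independent Exp(1) draws is Laplace(0, 1).
    """
    return value + b * (rng.expovariate(1.0) - rng.expovariate(1.0))


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    b = laplace_scale_gaussian([((0.0, 1.0), (1.0, 1.5))], epsilon=1.0, delta=0.05)
    print("scale b:", b)
    print("noisy release:", add_laplace_noise(10.0, b))
```

Both helpers return the smallest scale allowed by the corresponding inequality; any larger $b$ also satisfies the bound, at the cost of more noise in the released value.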

Approximation of Pufferfish Privacy for Gaussian Priors

Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration

General Inferential Limits Under Differential and Pufferfish Privacy

Count on Your Elders: Laplace vs Gaussian Noise

Pufferfish Privacy: An Information-Theoretic Study

Locally Private Gaussian Estimation

Privacy Guarantees in Posterior Sampling under Contamination

Private Estimation with Public Data

Optimal Noise-Adding Mechanism in Additive Differential Privacy

Pufferfish Privacy Mechanisms for Correlated Data

Generalized Gaussian Mechanism for Differential Privacy

Differential Privacy with Higher Utility by Exploiting Coordinate-wise Disparity: Laplace Mechanism Can Beat Gaussian in High Dimensions

Privately Learning Mixtures of Axis-Aligned Gaussians

General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation

Sample-Efficient Private Learning of Mixtures of Gaussians

Improving Utility for Privacy-Preserving Analysis of Correlated Columns using Pufferfish Privacy

Privacy Amplification for the Gaussian Mechanism via Bounded Support

Better Gaussian Mechanism using Correlated Noise

On the Privacy of Selection Mechanisms with Gaussian Noise

Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy