Auditing Private Prediction

Karan Chadha, Matthew Jagielski, Nicolas Papernot, Christopher Choquette-Choo, Milad Nasr
2024-02-15
Abstract: Differential privacy (DP) offers a theoretical upper bound on the potential privacy leakage of an algorithm, while empirical auditing establishes a practical lower bound. Auditing techniques exist for DP training algorithms. However, machine learning can also be made private at inference. We propose the first framework for auditing private prediction where we instantiate adversaries with varying poisoning and query capabilities. This enables us to study the privacy leakage of four private prediction algorithms: PATE [Papernot et al., 2016], CaPC [Choquette-Choo et al., 2020], PromptPATE [Duan et al., 2023], and Private-kNN [Zhu et al., 2020]. To conduct our audit, we introduce novel techniques to empirically evaluate privacy leakage in terms of Rényi DP. Our experiments show that (i) the privacy analysis of private prediction can be improved, (ii) algorithms which are easier to poison lead to much higher privacy leakage, and (iii) the privacy leakage is significantly lower for adversaries without query control than those with full control.
Cryptography and Security
What problem does this paper attempt to address?
The paper addresses how to audit private prediction algorithms in machine learning, that is, how to measure how much privacy they actually leak. It focuses on models that enforce privacy at inference time by adding noise to their predictions so that the outputs satisfy differential privacy (DP). Auditing methods already exist for DP training algorithms, but effective auditing techniques for private prediction at inference time have been lacking. The paper therefore proposes the first auditing framework for private prediction, which quantifies privacy leakage under adversaries with different poisoning and query capabilities.

### Main contributions of the paper:

1. **Proposing an auditing framework**:
   - The paper proposes a systematic framework for auditing the privacy leakage of private prediction algorithms. The framework studies four private prediction algorithms (PATE, CaPC, PromptPATE, Private-kNN) by instantiating adversaries with different poisoning and query capabilities.
2. **Introducing new auditing techniques**:
   - To evaluate privacy leakage, the paper measures it in terms of Rényi differential privacy (RDP) and proposes a method to calculate the exact Rényi divergence between adjacent histograms. These techniques give a more accurate estimate of the leakage of each query, and the per-query lower bounds are then combined through the lossless RDP composition theorem.
3. **Experimental verification**:
   - The paper experimentally verifies the effectiveness of the proposed auditing framework.
     The experimental results show that the privacy analysis of private prediction algorithms can be improved and that algorithms that are easier to poison leak substantially more privacy. In addition, the privacy leakage is significantly lower for adversaries without query control than for adversaries with full control.

### Key techniques and methods:

- **Rényi Differential Privacy (RDP)**:
  - RDP is an alternative to (ε, δ)-differential privacy. It composes losslessly, which makes it suitable for evaluating the combined privacy leakage of multiple test queries. RDP bounds the Rényi divergence of order \(\alpha\) between the output distributions on adjacent datasets, where the divergence is defined as:
    \[ D_\alpha(P \| Q) := \frac{1}{\alpha - 1} \log \mathbb{E}_{x \sim Q} \left[ \left( \frac{P(x)}{Q(x)} \right)^\alpha \right] \]
- **2-cut auditing**:
  - The 2-cut auditing method provides a lower bound with a hypothesis-testing interpretation. It takes the supremum, over all output sets \(O\), of the Rényi divergence between the two binary distributions induced by the event that the output lies in \(O\). The 2-cut divergence is defined as:
    \[ D_\alpha^2(\mu_1 \| \mu_2) := \sup_{O \subseteq \Omega} \frac{1}{\alpha - 1} \log \left( p_1^\alpha p_2^{1-\alpha} + (1 - p_1)^\alpha (1 - p_2)^{1-\alpha} \right) \]
    where \( p_1 = \mu_1(O) \) and \( p_2 = \mu_2(O) \) are the probabilities that an output drawn from \(\mu_1\) or \(\mu_2\), respectively, lies in \(O\).
- **Exact Rényi divergence calculation**:
  - The paper also proposes a method to calculate the exact Rényi divergence between adjacent histograms. The probability that a given category \(c\) is output is calculated through the following formula:
    \[ \text{Pr}[c] = \int_{-\infty}^{\infty} \frac{1}{\sigma} \phi \left( \frac{x - n_c}{\sigma} \right) \prod_{i \neq c} \Phi \left( \frac{x - n_i}{\sigma} \right) dx \]
    where \(\phi\) and \(\Phi\) are, respectively, the standard normal density and cumulative distribution function.
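
The three quantities above can be illustrated together with a small numerical sketch. This is not the paper's implementation: it assumes a noisy-argmax mechanism that adds independent Gaussian noise of scale `sigma` to each histogram count, and the counts, `sigma`, `alpha`, and all function names are hypothetical choices made for illustration. The output probabilities are computed by trapezoidal integration, and the 2-cut supremum is found by brute-force enumeration of output sets, which is only feasible for a handful of classes.

```python
import math
from itertools import combinations


def renyi_divergence(p, q, alpha):
    """D_alpha(P || Q) for discrete distributions, alpha > 1.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0.0)
    return math.log(total) / (alpha - 1.0)


def two_cut_divergence(p, q, alpha):
    """Brute-force 2-cut: maximise the binary Renyi divergence over output sets O."""
    k = len(p)
    best = 0.0
    for size in range(1, k):  # skip the empty set and the full set
        for subset in combinations(range(k), size):
            p1 = sum(p[i] for i in subset)
            p2 = sum(q[i] for i in subset)
            binary = renyi_divergence([p1, 1.0 - p1], [p2, 1.0 - p2], alpha)
            best = max(best, binary)
    return best


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def noisy_argmax_probs(counts, sigma, steps=4000):
    """Pr[c] that class c wins argmax_i (n_i + N(0, sigma^2)), by trapezoidal rule."""
    lo = min(counts) - 10.0 * sigma  # the integrand is negligible outside this range
    hi = max(counts) + 10.0 * sigma
    h = (hi - lo) / steps
    probs = []
    for c, n_c in enumerate(counts):
        total = 0.0
        for k in range(steps + 1):
            x = lo + k * h
            # Density of the noisy count of class c at x (note the 1/sigma factor) ...
            value = std_normal_pdf((x - n_c) / sigma) / sigma
            # ... times the probability that every other noisy count is below x.
            for i, n_i in enumerate(counts):
                if i != c:
                    value *= std_normal_cdf((x - n_i) / sigma)
            weight = 0.5 if k in (0, steps) else 1.0
            total += weight * value
        probs.append(total * h)
    return probs


# Hypothetical adjacent histograms: one vote moves from class 0 to class 1.
sigma, alpha = 2.0, 2.0
p = noisy_argmax_probs([6, 3, 1], sigma)
q = noisy_argmax_probs([5, 4, 1], sigma)

full = renyi_divergence(p, q, alpha)   # exact divergence over the label distribution
cut = two_cut_divergence(p, q, alpha)  # 2-cut value, never larger than `full`
print(f"exact D_alpha = {full:.6f}, 2-cut D_alpha = {cut:.6f}")
```

By the data-processing inequality the 2-cut value cannot exceed the exact divergence, so comparing the two numbers on a toy histogram gives a rough feel for how much is lost by reducing the output to a binary event.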