Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot
2024-10-15
Abstract: Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we focus on a threat model where the adversary has access only to the final model, with no visibility into intermediate updates. In the literature, this hidden state threat model exhibits a significant gap between the lower bound from empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that *craft a gradient sequence* designed to maximize the privacy loss of the final model without relying on intermediate updates. Our experiments show that this approach consistently outperforms previous attempts at auditing the hidden state model. Furthermore, our results advance the understanding of achievable privacy guarantees within this threat model. Specifically, when the crafted gradient is inserted at every optimization step, we show that concealing the intermediate model updates in DP-SGD does not amplify privacy. The situation is more complex when the crafted gradient is not inserted at every step: our auditing lower bound matches the privacy upper bound only for an adversarially-chosen loss landscape and a sufficiently large batch size. This suggests that existing privacy upper bounds can be improved in certain regimes.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper aims to audit the privacy of differentially private stochastic gradient descent (DP-SGD) more tightly in the "hidden state" threat model, in which only the final model is released and the intermediate updates stay hidden. In this setting, existing work shows a significant gap between the lower bound obtained by empirical auditing and the theoretical upper bound given by privacy accounting. To narrow this gap, the paper proposes adversaries that craft a gradient sequence designed to maximize the privacy loss of the final model. Because these adversaries do not need to see intermediate models, they yield tighter auditing lower bounds in this threat model.

### Main Contributions

1. **A new auditing method**: The paper proposes gradient-crafting adversaries, which generate gradient sequences that maximize the privacy loss of the final model without relying on intermediate models.
2. **Tighter auditing results**: Experiments show that the new adversaries consistently outperform previous attempts at auditing the hidden state threat model and match the existing privacy upper bound in some cases.
3. **Insight into privacy amplification**: When the crafted gradient is inserted at every optimization step, hiding the intermediate models does not amplify privacy. When it is not inserted at every step, the auditing lower bound matches the upper bound only for an adversarially chosen loss landscape and a sufficiently large batch size. When the batch size is small relative to the noise variance, some privacy amplification appears to remain, although the effect is weaker than in convex problems.

### Method Overview

- **Gradient-crafting adversaries**: The adversary crafts the gradient sequence directly instead of crafting data points (canaries). This avoids loose auditing results caused by poorly designed canary points.
- **Two specific adversary examples**:
  - **Randomly Biased Dimension (AGC-R)**: The adversary randomly selects a dimension and generates a gradient with the maximum magnitude in this dimension.
  - **Simulated Biased Dimension (AGC-S)**: The adversary simulates the training process, selects the dimension with the least update, and generates a gradient with the maximum magnitude in this dimension.
- **Experimental verification**: Experiments were carried out on the CIFAR10 and Housing datasets, using several model families (convolutional neural networks, residual networks, and fully connected neural networks).

### Conclusion

By proposing gradient-crafting adversaries, the paper provides tighter privacy auditing results in the hidden state threat model. These results help clarify the privacy behavior of DP-SGD in this model and suggest that existing privacy accounting upper bounds can be improved in certain regimes.
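
The two gradient-crafting adversaries from the method overview can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the reading of "least update" as the smallest total absolute movement, and the toy training loop are all assumptions made for illustration. In the loop, every non-canary example contributes a zero gradient and the noisy sum is not divided by a batch size.

```python
import random


def crafted_gradient_random(dim, clip_norm, rng):
    """AGC-R-style canary: all gradient mass on one randomly chosen dimension."""
    j = rng.randrange(dim)
    grad = [0.0] * dim
    grad[j] = clip_norm  # largest magnitude permitted by the clipping norm
    return grad, j


def crafted_gradient_simulated(simulated_updates, clip_norm):
    """AGC-S-style canary: target the dimension that moved least in a simulated run.

    simulated_updates is a list of per-step update vectors. "Least update" is
    interpreted as the smallest total absolute movement (an assumption).
    """
    dim = len(simulated_updates[0])
    movement = [sum(abs(step[k]) for step in simulated_updates) for k in range(dim)]
    j = min(range(dim), key=movement.__getitem__)
    grad = [0.0] * dim
    grad[j] = clip_norm
    return grad, j


def final_model(dim, steps, lr, clip_norm, sigma, rng, canary=None):
    """Toy DP-SGD run that exposes only the final parameters (hidden state).

    Assumption: all other examples have zero gradient, so each step applies the
    optional canary gradient plus Gaussian noise of scale sigma * clip_norm.
    """
    theta = [0.0] * dim
    for _ in range(steps):
        for k in range(dim):
            signal = canary[k] if canary is not None else 0.0
            noise = rng.gauss(0.0, sigma * clip_norm)
            theta[k] -= lr * (signal + noise)
    return theta


if __name__ == "__main__":
    rng = random.Random(0)
    canary, j = crafted_gradient_random(dim=8, clip_norm=1.0, rng=rng)
    with_canary = final_model(8, steps=20, lr=0.1, clip_norm=1.0, sigma=1.0, rng=rng, canary=canary)
    without_canary = final_model(8, steps=20, lr=0.1, clip_norm=1.0, sigma=1.0, rng=rng)
    # The adversary would inspect coordinate j of the final model only.
    print("targeted coordinate:", j, with_canary[j], without_canary[j])
```

In this toy setting the canary shifts the targeted coordinate of the final model by `lr * steps * clip_norm` in expectation, so an auditor can distinguish the two worlds from that coordinate alone. A full audit would repeat such runs many times and convert the adversary's error rates into an empirical privacy lower bound, a step this sketch omits.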