Optimal Defenses Against Gradient Reconstruction Attacks

Yuxiao Chen, Gamze Gürsoy, Qi Lei
2024-11-06
Abstract: Federated Learning (FL) is designed to prevent data leakage through collaborative model training without centralized data storage. However, it remains vulnerable to gradient reconstruction attacks that recover original training data from shared gradients. To optimize the trade-off between data leakage and utility loss, we first derive a theoretical lower bound of reconstruction error (among all attackers) for the two standard methods: adding noise, and gradient pruning. We then customize these two defenses to be parameter- and model-specific and achieve the optimal trade-off between our obtained reconstruction lower bound and model utility. Experimental results validate that our methods outperform Gradient Noise and Gradient Pruning by protecting the training data better while also achieving better utility.
Machine Learning, Artificial Intelligence, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses how to defend against Gradient Reconstruction Attacks (GRA) in Federated Learning (FL) while limiting the cost to model training. Specifically, the authors aim to optimize the trade-off between data leakage and utility loss.

### Problem Background

Federated Learning allows multiple institutions or devices to train a model collaboratively without storing data centrally. Although this protects data privacy to a certain extent, it remains exposed to Gradient Reconstruction Attacks, which reconstruct the original training data from the shared gradients and thereby leak sensitive information.

### Research Objectives

To meet this challenge, the paper proposes two optimized defense mechanisms:

1. **Optimal Gradient Noise**: Perturb the gradient by adding optimal Gaussian noise, making it difficult for attackers to reconstruct the original data from the perturbed gradient.
2. **Optimal Gradient Pruning**: Selectively set unimportant gradients to zero and add a small amount of noise to the remaining gradients to reduce information leakage.

### Main Contributions

- **Theoretical Lower Bound**: The authors derive a theoretical lower bound on the reconstruction error that holds across all attackers, so it can be used to assess a defense independently of any particular attack method.
- **Optimization Algorithms**: The paper proposes two specific defense mechanisms, optimal gradient noise and optimal gradient pruning, each of which maximizes the lower bound of the reconstruction error at a given utility level.
- **Experimental Verification**: Experiments on image classification tasks show that the proposed methods maintain model performance better than the standard Gradient Noise and Gradient Pruning defenses while also protecting the training data better.
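The two defense primitives described under Research Objectives can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names, the example gradient, the per-coordinate variances, and the `keep` mask are all hypothetical, and the rule the paper uses to choose them is not reproduced here.

```python
import math
import random


def add_gaussian_noise(grad, variances, rng):
    """Return grad plus independent zero-mean Gaussian noise, one variance per coordinate."""
    return [g + rng.gauss(0.0, math.sqrt(v)) for g, v in zip(grad, variances)]


def prune_with_noise(grad, keep, noise_std, rng):
    """Zero the coordinates where keep[i] is False; add small Gaussian noise to the rest."""
    return [g + rng.gauss(0.0, noise_std) if k else 0.0 for g, k in zip(grad, keep)]


rng = random.Random(0)            # seeded only to make the sketch reproducible
grad = [0.8, -0.05, 1.2, 0.01]    # hypothetical gradient a client would share

# Gradient noise: every coordinate is perturbed, with its own (assumed) variance.
noisy = add_gaussian_noise(grad, variances=[0.01, 0.04, 0.01, 0.04], rng=rng)

# Gradient pruning: coordinates flagged as unimportant (assumed mask) are dropped,
# and the surviving coordinates receive a small amount of noise.
pruned = prune_with_noise(grad, keep=[True, False, True, False], noise_std=0.01, rng=rng)
```

In both cases the client would share `noisy` or `pruned` instead of `grad`; the paper's contribution lies in how the variances and the pruning decisions are chosen, which the sketch leaves as inputs.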
### Formula Summary

- **Lower Bound of Reconstruction Error**: Let \( g(x) \) be the gradient computed on data \( x \in \mathbb{R}^d \), let \( S \) be the defense applied to it, and let \( R \) range over all reconstruction functions. The bound is defined as

  \[ B_{D,S} := \min_{R: \mathbb{R}^m \to \mathbb{R}^d} \mathbb{E}_{x \sim D} \mathbb{E}_{y \sim S(g(x))} \| R(y) - x \|^2 \]

  According to the Bayesian Cramér-Rao Lower Bound, we have:

  \[ B_{D,S} \geq \frac{d^2}{\mathbb{E}_{x \sim D}[\text{tr}(J_F(x))] + d\cdot\lambda_1(J_P)} \]

  where \( J_F(x) \) and \( J_P \) are the data information matrix and the prior information matrix respectively.

- **Optimal Gradient Noise**: the diagonal entries of the Gaussian noise covariance \( \Sigma \) are

  \[ \Sigma_{i,i} = \lambda \frac{\mathbb{E}_{x \sim D} \|\nabla_x g_i(x)\|^2}{\mathbb{E}_{x \sim D} g_i(x)^2} \]

- **Optimal Gradient Pruning**:

  \[ k_i = \frac{\mathbb{E}_{x \sim D} \|\nabla_x g_i(x)\|^2}{\mathbb{E}_{x \sim D} g_i(x)^2} \]

With these choices, the paper optimizes the privacy-utility trade-off in Federated Learning and reports stronger protection than the standard defenses.
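Both the noise and pruning formulas rely on the same per-coordinate ratio, \( \mathbb{E}_{x \sim D} \|\nabla_x g_i(x)\|^2 / \mathbb{E}_{x \sim D} g_i(x)^2 \). The sketch below estimates it by Monte Carlo for a toy model chosen purely for illustration: a linear model with squared loss \( \tfrac{1}{2}(w^\top x - y)^2 \) and a fixed label \( y \), so that \( g(x) = (w^\top x - y)\,x \) and \( \partial g_i / \partial x_j = x_i w_j + (w^\top x - y)\,[i = j] \). The model, the standard normal data distribution, the sample size, and the value of `lam` (standing in for \( \lambda \)) are assumptions, not taken from the paper.

```python
import random


def grad_wrt_params(w, x, y):
    """g(x): gradient of 0.5 * (w.x - y)**2 with respect to the weights w."""
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [r * xi for xi in x]


def jacobian_row_sq_norms(w, x, y):
    """||grad_x g_i(x)||^2 for each i, using d g_i / d x_j = x_i * w_j + r * [i == j]."""
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    w_sq = sum(wj * wj for wj in w)
    return [xi * xi * w_sq + 2.0 * r * xi * wi + r * r for xi, wi in zip(x, w)]


def sensitivity_ratio(w, xs, y):
    """Monte Carlo estimate of E||grad_x g_i||^2 / E[g_i^2] for every coordinate i."""
    n, dim = len(xs), len(w)
    jac_sq = [0.0] * dim
    grad_sq = [0.0] * dim
    for x in xs:
        for i, v in enumerate(jacobian_row_sq_norms(w, x, y)):
            jac_sq[i] += v / n
        for i, g in enumerate(grad_wrt_params(w, x, y)):
            grad_sq[i] += g * g / n
    return [a / b for a, b in zip(jac_sq, grad_sq)]


rng = random.Random(0)
w = [1.0, -2.0, 0.5]                                              # toy weights
xs = [[rng.gauss(0.0, 1.0) for _ in w] for _ in range(1000)]      # samples from an assumed D
ratio = sensitivity_ratio(w, xs, y=0.3)                           # the k_i expression

lam = 0.1                                  # hypothetical trade-off parameter lambda
sigma_diag = [lam * r for r in ratio]      # Sigma_{i,i} as written in the noise formula
```

For a real network the Jacobian \( \nabla_x g(x) \) would come from automatic differentiation rather than a closed form; the closed form is used here only to keep the example dependency-free.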