Christina Runkel, Kanchana Vaishnavi Gandikota, Jonas Geiping, Carola-Bibiane Schönlieb, Michael Moeller
Abstract: Being able to reconstruct training data from the parameters of a neural network is a major privacy concern. Previous works have shown that reconstructing training data, under certain circumstances, is possible. In this work, we analyse such reconstructions empirically and propose a new formulation of the reconstruction as a solution to a bilevel optimisation problem. We demonstrate that our formulation as well as previous approaches highly depend on the initialisation of the training images $x$ to reconstruct. In particular, we show that a random initialisation of $x$ can lead to reconstructions that resemble valid training samples while not being part of the actual training dataset. Thus, our experiments on affine and one-hidden-layer networks suggest that, when reconstructing natural images, an adversary cannot identify whether reconstructed images have indeed been part of the set of training samples.
### What problem does this paper attempt to solve?
This paper addresses **the privacy risk of reconstructing training data from neural network parameters**. Specifically, the authors study whether an adversary who knows only the final trained weights of a network (i.e., the model parameters) can reconstruct the training data, partially or fully. The problem matters because a successful reconstruction would leak any sensitive information the training data contains.
#### Main research contents:
1. **Proposal of the bilevel optimization problem**:
- The authors model the training data reconstruction problem as a **bilevel optimization problem**: the upper-level objective makes the parameters obtained by training on the reconstructed data as close as possible to the known trained parameters, while the lower-level problem is the training itself, i.e., minimizing the training loss over the reconstructed data.
- The bilevel optimization problem is:
\[
\min_{x, y} l(\theta^*, \theta(x, y)) \quad \text{s.t.} \quad \theta(x, y) = \arg\min_\theta \frac{1}{m} \sum_{i = 1}^m L(\Phi(x_i; \theta), y_i)
\]
where \( \theta^* \) denotes the known trained model parameters, \( \theta(x, y) \) the parameters obtained by training on the reconstructed data \( (x, y) \), \( l \) a measure of the discrepancy between the two parameter vectors, \( m \) the number of reconstructed samples, \( L \) the training loss function, and \( \Phi \) the neural network.
2. **Influence of the initialization**:
- The authors find experimentally that the reconstruction results depend strongly on the initialization of the input data \( x \): different initializations can lead to completely different reconstructions.
- A randomly initialized \( x \) can yield plausible-looking images that are nevertheless not samples from the original training set. So even when images resembling training data can be reconstructed, one cannot determine whether they are actual training samples.
3. **Possibility of privacy protection**:
- Based on these findings, the authors argue that the ambiguity of the reconstruction (i.e., the impossibility of determining whether a reconstructed image belongs to the original training set) provides a certain degree of privacy protection: even if an attacker reconstructs plausible-looking images, they cannot confirm that these images come from the training set.
4. **Comparison with other methods**:
- The authors also compare their bilevel optimization method with previously proposed reconstruction methods (such as the DecoReco method) and find that the methods show a similar dependence on the initialization, which further confirms its influence on the reconstruction results.
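To make the bilevel formulation and its dependence on the initialization concrete, the following is a minimal Python sketch on a toy problem, not the authors' implementation. It assumes a one-parameter linear model \( \Phi(x; \theta) = \theta x \) with squared loss, for which the lower-level problem has the closed form \( \theta(x, y) = \sum_i x_i y_i / \sum_i x_i^2 \). It further assumes that \( l \) is half the squared difference of the parameters and that the labels \( y \) are known and fixed, so only \( x \) is optimized. The function names (`fit_theta`, `upper_loss`, `upper_grad`, `reconstruct`) and the backtracking step-size rule are illustrative choices.

```python
import random


def fit_theta(x, y):
    """Lower level: least-squares minimiser for the toy model Phi(x; theta) = theta * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def upper_loss(theta_star, x, y):
    """Upper level: half the squared gap between known and retrained parameter."""
    return 0.5 * (fit_theta(x, y) - theta_star) ** 2


def upper_grad(theta_star, x, y):
    """Gradient of upper_loss with respect to the candidate inputs x (chain rule)."""
    s = sum(xi * xi for xi in x)
    theta = fit_theta(x, y)
    r = theta - theta_star
    return [r * (yi - 2.0 * theta * xi) / s for xi, yi in zip(x, y)]


def reconstruct(theta_star, y, x_init, iters=300, tol=1e-20):
    """Gradient descent on x with a backtracking (Armijo) step size."""
    x = list(x_init)
    step = 1.0
    for _ in range(iters):
        loss = upper_loss(theta_star, x, y)
        if loss < tol:
            break
        grad = upper_grad(theta_star, x, y)
        gnorm2 = sum(g * g for g in grad)
        # Halve the step until the loss decreases sufficiently.
        while True:
            cand = [xi - step * gi for xi, gi in zip(x, grad)]
            if upper_loss(theta_star, cand, y) <= loss - 0.25 * step * gnorm2:
                break
            step *= 0.5
        x = cand
        step *= 2.0  # try a larger step in the next iteration
    return x


# Hypothetical "training set": 8 scalar inputs with noisy linear labels.
data_rng = random.Random(0)
x_true = [data_rng.gauss(0.0, 1.0) for _ in range(8)]
y = [1.5 * xi + 0.1 * data_rng.gauss(0.0, 1.0) for xi in x_true]
theta_star = fit_theta(x_true, y)  # the "trained" parameter known to the attacker

# Reconstruct from two different random initialisations of x.
for seed in (1, 2):
    init_rng = random.Random(seed)
    x0 = [init_rng.gauss(0.0, 1.0) for _ in range(8)]
    x_rec = reconstruct(theta_star, y, x0)
    gap = max(abs(a - b) for a, b in zip(x_rec, x_true))
    print(f"seed={seed}  upper loss={upper_loss(theta_star, x_rec, y):.2e}  "
          f"max |x_rec - x_true|={gap:.3f}")
```

In this toy setting every \( x \) with \( \sum_i x_i y_i = \theta^* \sum_i x_i^2 \) attains zero upper-level loss, so runs from different random initializations would be expected to end at different inputs that all reproduce \( \theta^* \). The true inputs are only one of many such solutions.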
#### Conclusion:
The main contribution of this paper is to show that reconstructing training data from neural network parameters depends strongly on the initialization, and to point out that the resulting ambiguity can provide a certain degree of privacy protection. Although some methods can reconstruct plausible-looking images, these images are not necessarily part of the original training data, so an attacker can hardly determine which of them are real training samples.