Abstract: Being able to reconstruct training data from the parameters of a neural network is a major privacy concern. Previous works have shown that reconstructing training data is possible under certain circumstances. In this work, we analyse such reconstructions empirically and propose a new formulation of the reconstruction as a solution to a bilevel optimisation problem. We demonstrate that our formulation, as well as previous approaches, depends highly on the initialisation of the training images $x$ to reconstruct. In particular, we show that a random initialisation of $x$ can lead to reconstructions that resemble valid training samples while not being part of the actual training dataset. Thus, our experiments on affine and one-hidden-layer networks suggest that, when reconstructing natural images, an adversary cannot identify whether reconstructed images have indeed been part of the set of training samples.

### What problem does this paper attempt to solve?

This paper focuses on **the privacy risk of reconstructing training data from neural network parameters**. Specifically, the authors study how training data can be reconstructed (partially or fully) when only the final trained weights of the neural network (i.e., the model parameters) are known. The problem matters because, if training data can be reconstructed successfully, any sensitive information it contains may be leaked.

#### Main research contents:

1. **Proposal of the bilevel optimization problem**:
   - The authors model training data reconstruction as a **bilevel optimization problem**. The upper-level objective makes the parameters obtained by training on the candidate data match the known trained parameters as closely as possible. The lower-level problem is the training itself, i.e., minimizing the training loss over the model parameters.
   - The bilevel optimization problem is:

     \[ \min_{x, y} l(\theta^*, \theta(x, y)) \quad \text{s.t.} \quad \theta(x, y) = \arg\min_\theta \frac{1}{m} \sum_{i = 1}^m L(\Phi(x_i; \theta), y_i) \]

     where $\theta^*$ is the known trained model parameter, $\theta(x, y)$ is the model parameter obtained by training on the reconstructed data $(x, y)$, $l$ measures the distance between the two parameter vectors, $L$ is the training loss function, $\Phi$ is the neural network model, and $m$ is the number of reconstructed samples.
2. **Influence of initial conditions**:
   - The authors find experimentally that the reconstruction results depend strongly on the initialization of the input data $x$. Different initializations may lead to completely different reconstructions.
   - Experiments show that a randomly initialized $x$ may yield plausible-looking images that are not samples from the original training set. This means that even if images resembling training data can be reconstructed, it is impossible to determine whether they are real training data.
3. **Possibility of privacy protection**:
   - Based on these findings, the authors suggest that the uncertainty in the reconstruction process (i.e., the inability to determine whether a reconstructed image belongs to the original training set) provides a certain degree of privacy protection. Even if an attacker can reconstruct plausible-looking images, they cannot confirm whether these images come from the training set.
4. **Comparison with other methods**:
   - The authors also compare their bilevel optimization method with previously proposed reconstruction methods (such as the DecoReco method) and find that the different methods show a similar dependence on the initialization, which further supports the strong influence of initial conditions on reconstruction results.

#### Conclusion:

The main contribution of this paper is to show that reconstructing training data from neural network parameters depends strongly on initial conditions, and to point out that this uncertainty can provide a certain degree of privacy protection. Although some methods can reconstruct plausible-looking images, these images are not necessarily part of the original training data, so it is difficult for an attacker to determine which images are real training data.
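The bilevel structure and its sensitivity to initialization can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-parameter linear model $\Phi(x; \theta) = \theta x$ with scalar inputs, a squared loss with a small ridge term (`REG`, a hypothetical value) so that the lower-level problem has a closed-form solution, and labels $y$ that are known and held fixed. All function names are illustrative.

```python
import random

REG = 0.1  # assumed ridge strength (not from the paper); keeps the lower level well-posed


def train(xs, ys):
    """Lower level: closed-form minimiser of sum_i (theta*x_i - y_i)^2 + REG*theta^2.

    The 1/m factor of the summary's formula is omitted; it does not change the argmin.
    """
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + REG
    return num / den


def upper_loss_and_grad(xs, ys, theta_star):
    """Upper level: (theta(x, y) - theta*)^2 and its gradient with respect to each x_k."""
    theta = train(xs, ys)
    den = sum(x * x for x in xs) + REG
    gap = theta - theta_star
    # Quotient rule: d theta / d x_k = (y_k - 2 * x_k * theta) / den
    grad = [2.0 * gap * (y - 2.0 * x * theta) / den for x, y in zip(xs, ys)]
    return gap * gap, grad


def reconstruct(xs_init, ys, theta_star, steps=200):
    """Gradient descent on the inputs; step halving ensures the loss never increases."""
    xs = list(xs_init)
    loss, grad = upper_loss_and_grad(xs, ys, theta_star)
    for _ in range(steps):
        step = 1.0
        while step > 1e-12:
            cand = [x - step * g for x, g in zip(xs, grad)]
            cand_loss, cand_grad = upper_loss_and_grad(cand, ys, theta_star)
            if cand_loss < loss:
                xs, loss, grad = cand, cand_loss, cand_grad
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return xs, loss


random.seed(0)
m = 4  # number of (scalar) training inputs in this toy setting
xs_true = [random.gauss(0, 1) for _ in range(m)]
ys = [random.gauss(0, 1) for _ in range(m)]
theta_star = train(xs_true, ys)  # the "known" trained parameter

# Two different random initialisations of the inputs to reconstruct.
init_a = [random.gauss(0, 1) for _ in range(m)]
init_b = [random.gauss(0, 1) for _ in range(m)]
loss_a0, _ = upper_loss_and_grad(init_a, ys, theta_star)
loss_b0, _ = upper_loss_and_grad(init_b, ys, theta_star)
xs_a, loss_a = reconstruct(init_a, ys, theta_star)
xs_b, loss_b = reconstruct(init_b, ys, theta_star)

print("true inputs:     ", xs_true)
print("reconstruction A:", xs_a, "parameter loss", loss_a0, "->", loss_a)
print("reconstruction B:", xs_b, "parameter loss", loss_b0, "->", loss_b)
```

In this toy problem a single trained parameter constrains several unknown inputs, so many different inputs are consistent with the same $\theta^*$. When both runs reach a small parameter loss while `xs_a`, `xs_b` and `xs_true` differ, a low loss alone does not identify the true training inputs, which is the kind of ambiguity the summary describes.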

Training Data Reconstruction: Privacy due to Uncertainty?

On the Reconstruction of Training Data from Group Invariant Networks

Exploring the Security Boundary of Data Reconstruction via Neuron Exclusivity Analysis

Reconstructing Training Data From Real World Models Trained with Transfer Learning

Bounding Reconstruction Attack Success of Adversaries Without Data Priors

Understanding Training-Data Leakage from Gradients in Neural Networks for Image Classification

Reconstructing Training Data from Model Gradient, Provably

Bounding Training Data Reconstruction in DP-SGD

Data Reconstruction Attacks and Defenses: A Systematic Evaluation

Network Inversion for Training-Like Data Reconstruction

Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable

Reconciling privacy and accuracy in AI for medical imaging

Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation

Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses

Protection Against Reconstruction and Its Applications in Private Federated Learning

Provable Privacy Attacks on Trained Shallow Neural Networks

Adversarial Learning of Privacy-Preserving and Task-Oriented Representations

Only My Model On My Data: A Privacy Preserving Approach Protecting one Model and Deceiving Unauthorized Black-Box Models

Visual Privacy Auditing with Diffusion Models

Recover User's Private Training Image Data by Gradient in Federated Learning

Uncertainty-Aware Null Space Networks for Data-Consistent Image Reconstruction