Abstract: Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. In this work, we take a different perspective, and propose a novel and simple Transfer Learning-based Defense against Model Inversion (TL-DMI) to render MI-robust models. Particularly, by leveraging TL, we limit the number of layers encoding sensitive information from private training dataset, thereby degrading the performance of MI attack. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in extensive experiments that TL-DMI achieves state-of-the-art (SOTA) MI robustness. Our code, pre-trained models, demo and inverted data are available at:

What problem does this paper attempt to address?

This paper seeks an effective defense against Model Inversion (MI) attacks, a privacy threat in which an attacker reconstructs private training data by abusing access to a machine learning model. Existing MI defenses usually rely on regularization that conflicts directly with the training objective, leading to a significant decline in model utility. The paper proposes a new MI defense based on transfer learning (TL-DMI), which aims to improve robustness against MI attacks without significantly reducing model utility.

### Main contributions of the paper

1. **Propose a simple and efficient MI defense based on transfer learning (TL-DMI)**: The method limits the number of layers fine-tuned on the private dataset. This reduces the number of parameters that encode sensitive information and thereby lowers MI attack performance.
2. **Analyze the importance of model layers for the MI task for the first time**: Fisher Information (FI) is used to quantify how important each layer is to the MI task. The analysis finds that the first few layers are crucial for MI, while the last few layers matter more for classification.
3. **Show experimentally that fine-tuning fewer parameters on the private dataset improves MI robustness**: At the same natural accuracy, reducing the number of fine-tuned parameters significantly lowers MI attack accuracy.
4. **Compare comprehensively with existing state-of-the-art MI defenses**: TL-DMI performs strongly on MI robustness and also preserves model utility better than the other methods.

### Method overview

1. **Pre-training stage**: Pre-train on a public dataset, updating the parameters of the entire model.
2. **Fine-tuning stage**: Fine-tune only the last few layers of the model on the private dataset and freeze the first few layers, so that private information is not encoded into the frozen layers.

### Experimental verification

- **Layer importance analysis**: Fisher Information analysis shows that the first few layers of the model are very important for the MI task, while the last few layers are more important for classification.
- **Empirical analysis**: Reducing the number of parameters fine-tuned on the private dataset significantly improves MI robustness while maintaining the model's natural accuracy.
- **Comparison with existing methods**: Under various MI attack settings, TL-DMI outperforms existing SOTA methods in both MI robustness and model utility.

### Conclusion

By limiting the number of layers fine-tuned on the private dataset, TL-DMI improves the model's robustness against MI attacks while maintaining its natural accuracy. The method is simple to implement and applies to multiple model architectures and MI attack scenarios.
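The fine-tuning stage described in the method overview can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: the small CNN, the helper names `build_classifier`, `freeze_early_layers`, and `count_trainable`, the cut-off index `first_trainable`, and the random tensors standing in for private data are all assumptions made for illustration. The pre-training stage on a public dataset is omitted.

```python
import torch
from torch import nn


def build_classifier(num_classes: int) -> nn.Sequential:
    """Hypothetical small CNN standing in for the real backbone."""
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),   # index 0: early layer
        nn.ReLU(),
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # index 2: later layer
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, num_classes),                  # index 6: classifier head
    )


def freeze_early_layers(model: nn.Sequential, first_trainable: int) -> None:
    """Freeze every module before index `first_trainable`; leave the rest trainable."""
    for index, module in enumerate(model):
        for param in module.parameters():
            param.requires_grad = index >= first_trainable


def count_trainable(model: nn.Module) -> int:
    """Number of parameters that would be updated on the private dataset."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


torch.manual_seed(0)
model = build_classifier(num_classes=10)
# Stage 1 (omitted here): pre-train all parameters on a public dataset.

# Stage 2: only layers from index 2 onward see the private data.
freeze_early_layers(model, first_trainable=2)
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

# Random tensors stand in for one batch of private images and labels.
images = torch.randn(4, 3, 16, 16)
labels = torch.randint(0, 10, (4,))

frozen_before = model[0].weight.detach().clone()
loss = nn.functional.cross_entropy(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

total_params = sum(p.numel() for p in model.parameters())
trainable_params = count_trainable(model)
```

In this sketch, moving `first_trainable` toward the classifier head shrinks `trainable_params`, which corresponds to the summary's claim that fewer parameters fine-tuned on private data leaves less room for sensitive information to be encoded. How far the cut-off can move before natural accuracy drops is an empirical question that the sketch does not address.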

Model Inversion Robustness: Can Transfer Learning Help?

The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks

A GAN-Based Defense Framework Against Model Inversion Attacks.

NetGuard: Protecting Commercial Web APIs from Model Inversion Attacks Using GAN-generated Fake Samples

Improving Robustness to Model Inversion Attacks via Mutual Information Regularization

Re-thinking Model Inversion Attacks Against Deep Neural Networks

Defending against Model Inversion Attacks via Random Erasing

CALoR: Towards Comprehensive Model Inversion Defense

Model Inversion Attack via Dynamic Memory Learning

Model Inversion Attack against Transfer Learning: Inverting a Model without Accessing It

Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and Defenses

Inversion-guided Defense: Detecting Model Stealing Attacks by Output Inverting

On the Vulnerability of Skip Connections to Model Inversion Attacks

Trap-MID: Trapdoor-based Defense against Model Inversion Attacks

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models

Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning

Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks

Label-Only Model Inversion Attacks via Knowledge Transfer

Boosting Model Inversion Attacks with Adversarial Examples

Model Inversion Attacks: A Survey of Approaches and Countermeasures

MIBench: A Comprehensive Benchmark for Model Inversion Attack and Defense