Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu, Chenan Wang, Rui Zhang, Zidong Du, Qi Guo, Xing Hu
2024-06-24
Abstract: Stable Diffusion has established itself as a foundation model in generative AI artistic applications, receiving widespread research and application. Some recent fine-tuning methods have made it feasible for individuals to implant personalized concepts onto the basic Stable Diffusion model with minimal computational costs on small datasets. However, these innovations have also given rise to issues like facial privacy forgery and artistic copyright infringement. In recent studies, researchers have explored the addition of imperceptible adversarial perturbations to images to prevent potential unauthorized exploitation and infringements when personal data is used for fine-tuning Stable Diffusion. Although these studies have demonstrated the ability to protect images, it is essential to consider that these methods may not be entirely applicable in real-world scenarios. In this paper, we systematically evaluate the use of perturbations to protect images within a practical threat model. The results suggest that these approaches may not be sufficient to safeguard image privacy and copyright effectively. Furthermore, we introduce a purification method capable of removing protected perturbations while preserving the original image structure to the greatest extent possible. Experiments reveal that Stable Diffusion can effectively learn from purified images over all protective methods.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper asks whether protective perturbations can actually prevent personal data from being exploited without authorization when Stable Diffusion is fine-tuned to implant personalized concepts. It focuses on two applications: Stable Diffusion learning an individual's FaceID, and Stable Diffusion learning an artist's style from their artworks. As Stable Diffusion has spread through artistic creation, both have raised concerns about image privacy and copyright infringement. In response, researchers have explored adding imperceptible adversarial perturbations to images to prevent unauthorized exploitation when those images are used for fine-tuning. The effectiveness of these methods, however, has not been fully verified in practical scenarios. The paper therefore systematically evaluates them under a practical threat model and proposes a purification method, GrIDPure, that removes protective perturbations while preserving the original image structure as much as possible. The experimental results suggest that existing protection methods may not be sufficient to safeguard image privacy and copyright: Stable Diffusion can still learn effectively from images purified with GrIDPure, across all the protective methods evaluated.