Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu, Chenan Wang, Rui Zhang, Zidong Du, Qi Guo, Xing Hu

2024-06-24

Abstract: Stable Diffusion has established itself as a foundation model in generative AI artistic applications, receiving widespread research and application. Some recent fine-tuning methods have made it feasible for individuals to implant personalized concepts onto the basic Stable Diffusion model with minimal computational costs on small datasets. However, these innovations have also given rise to issues like facial privacy forgery and artistic copyright infringement. In recent studies, researchers have explored the addition of imperceptible adversarial perturbations to images to prevent potential unauthorized exploitation and infringements when personal data is used for fine-tuning Stable Diffusion. Although these studies have demonstrated the ability to protect images, it is essential to consider that these methods may not be entirely applicable in real-world scenarios. In this paper, we systematically evaluate the use of perturbations to protect images within a practical threat model. The results suggest that these approaches may not be sufficient to safeguard image privacy and copyright effectively. Furthermore, we introduce a purification method capable of removing protected perturbations while preserving the original image structure to the greatest extent possible. Experiments reveal that Stable Diffusion can effectively learn from purified images over all protective methods.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper asks whether protective perturbations can actually keep personal data from being exploited without authorization when Stable Diffusion is fine-tuned to implant personalized concepts. It focuses on two applications: preventing Stable Diffusion from learning an individual's face identity (FaceID) and preventing it from learning an artist's style from their artworks. As Stable Diffusion has spread through artistic applications, inexpensive fine-tuning on small datasets has raised concerns about facial privacy forgery and artistic copyright infringement. In response, researchers have proposed adding imperceptible adversarial perturbations to images so that fine-tuning on them fails. These methods have been shown to protect images, but their effectiveness in practical scenarios has not been fully verified. The paper therefore systematically evaluates them within a practical threat model and introduces a purification method, GrIDPure, that removes protective perturbations while preserving the original image structure as far as possible. The results suggest that existing protective methods may not be sufficient to safeguard image privacy and copyright: Stable Diffusion can still learn effectively from purified images, whichever protective method was applied.
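
As a rough illustration of what an "imperceptible adversarial perturbation" means in this setting, the sketch below nudges each pixel in the direction that increases a loss while keeping every change inside a small budget. It is a toy under stated assumptions, not the method of this paper or of any protective scheme it evaluates: the quadratic `toy_loss_grad`, the function names, and the budget values `8 / 255` and `2 / 255` are all hypothetical choices made for illustration, whereas real protective methods compute gradients through the diffusion model's own training loss.

```python
# Illustrative only: a toy "protective perturbation" on a flat list of pixel
# values in [0, 1]. The loss here is a stand-in, NOT a diffusion-model loss.

def toy_loss_grad(pixels, target):
    """Gradient of the toy loss sum((p - t)^2) with respect to each pixel."""
    return [2.0 * (p - t) for p, t in zip(pixels, target)]


def protect(pixels, target, eps=8 / 255, step=2 / 255, iters=10):
    """Signed-gradient ascent on the toy loss, projected into an L-inf ball.

    eps bounds the largest change to any single pixel, which is what keeps
    the perturbation visually small. All defaults are assumed values.
    """
    adv = list(pixels)
    for _ in range(iters):
        grad = toy_loss_grad(adv, target)
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)  # -1, 0, or +1
            moved = adv[i] + step * sign
            # Project back into the eps-ball around the original pixel.
            moved = max(pixels[i] - eps, min(pixels[i] + eps, moved))
            # Clamp to the valid pixel range.
            adv[i] = max(0.0, min(1.0, moved))
    return adv


def linf_distance(a, b):
    """Largest absolute per-pixel difference between two images."""
    return max(abs(x - y) for x, y in zip(a, b))


image = [0.2, 0.5, 0.9]
target = [0.5, 0.5, 0.1]
protected = protect(image, target)
```

The only guarantee this sketch makes is that `linf_distance(image, protected)` never exceeds `eps`. A purification step, in this picture, is any procedure that tries to map `protected` back toward `image` without access to the original.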

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

Protective Perturbations against Unauthorized Data Usage in Diffusion-based Image Generation

Rethinking and Defending Protective Perturbation in Personalized Diffusion Models

Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models

Targeted Attack Improves Protection against Unauthorized Diffusion Customization

Adversarial Robust Safeguard for Evading Deep Facial Manipulation

Toward Robust Imperceptible Perturbation against Unauthorized Text-to-image Diffusion-based Synthesis

DiffProtect: Generate Adversarial Examples with Diffusion Models for Facial Privacy Protection

Real-time Identity Defenses against Malicious Personalization of Diffusion Models

Visual-Friendly Concept Protection via Selective Adversarial Perturbations

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Toward effective protection against diffusion based mimicry through score distillation

Generation of Face Privacy-Protected Images Based on the Diffusion Model

Imperceptible Protection against Style Imitation from Diffusion Models

Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI

DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models

Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage