Abstract: The fabrication of visual misinformation on the web and social media has increased exponentially with the advent of foundational text-to-image diffusion models. In particular, Stable Diffusion inpainters allow the synthesis of maliciously inpainted images of personal and private figures and of copyrighted content, also known as deepfakes. To combat such generations, a disruption framework named Photoguard has been proposed, which adds adversarial noise to the context image to disrupt inpainting synthesis. While Photoguard suggested a diffusion-friendly approach, its disruption is not sufficiently strong, and it requires a significant amount of GPU memory and time to immunize the context image. In our work, we re-examine both the minimal and favorable conditions for a successful inpainting disruption, proposing DDD, a "Digression guided Diffusion Disruption" framework. First, we identify the most adversarially vulnerable diffusion timestep range with respect to the hidden space. Within this scope of the noised manifold, we pose the problem as a semantic digression optimization. We maximize the distance between the inpainting instance's hidden states and a semantic-aware hidden state centroid, calibrated both by Monte Carlo sampling of hidden states and by a discretely projected optimization in the token space. Effectively, our approach achieves stronger disruption and a higher success rate than Photoguard while lowering the GPU memory requirement and speeding up the optimization by up to three times.

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to effectively combat malicious image editing based on diffusion models, specifically by optimizing the context image with adversarial noise so that unauthorized edits fail. To this end, the paper proposes DDD (Digression guided Diffusion Disruption), a framework that addresses the abuse of diffusion inpainters such as Stable Diffusion.

### Problem Background

With the progress of deep learning, and especially of text-to-image generation models such as Stable Diffusion, malicious users can generate false content (deepfakes), including unauthorized edits of personal or private images and misuse of copyrighted content. This spreads misinformation and raises ethical issues. In response, researchers have proposed various methods that disrupt these generation models so that they cannot produce convincing false content.

### Specific Problems

1. **Limitations of Existing Methods**:
   - **Photoguard** is an existing adversarial method. It disrupts the inpainting process of the diffusion model by adding adversarial noise to the context image. However, Photoguard has a high computational cost (it requires a large amount of GPU memory and time), and its disruption is not consistently strong across different images and prompts.
2. **New Challenges**:
   - The generation process of diffusion models is progressive and iterative, unlike that of traditional GAN models. This makes it difficult to apply previous adversarial methods directly.
   - It is unclear which timestep is the most vulnerable, and how to introduce adversarial noise at that timestep to maximize the disruption.

### The Paper's Solutions

To overcome the above problems, the paper proposes the following innovations:

1. **Identifying the Most Vulnerable Timestep**:
   - Early timesteps have a greater impact on the overall spatial structure and global semantics of the image. The paper therefore introduces adversarial noise at the early timesteps to achieve global disruption.
2. **Semantic Digression Optimization**:
   - A semantic-aware hidden-state centroid is constructed through Monte Carlo sampling of hidden states and a discretely projected optimization in the token space. The distance between the inpainting instance's hidden states and this centroid is then maximized, which produces the semantic digression.
3. **Efficient Optimization Framework**:
   - The DDD framework reduces GPU memory usage and running time while maintaining an effective level of disruption. Specifically, DDD is up to 3 times faster than Photoguard and requires less GPU memory.

### Summary

The core problem of the paper is to effectively combat malicious image editing based on diffusion models by optimizing the context image. The DDD framework achieves stronger disruption at lower computational cost by identifying the most vulnerable timestep range, optimizing for semantic digression, and using an efficient optimization framework.
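The semantic digression idea summarized above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the "denoiser" is a hypothetical one-layer `tanh` map, the noise-level range, perturbation budget, and all names (`hidden`, `digression`, `EPS`, and so on) are assumptions chosen for illustration, and the token-space projection used to calibrate the centroid is omitted. The sketch only shows the general pattern: estimate a hidden-state centroid by Monte Carlo sampling over noise and timesteps, then run projected sign-gradient ascent on the image to push its hidden states away from that centroid within an L-infinity budget.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes and hyperparameters (not taken from the paper).
D, M, K = 32, 16, 64                   # image dim, hidden dim, Monte Carlo samples
EPS, STEP, N_STEPS = 0.05, 0.01, 30    # L-inf budget, step size, ascent steps

# Stand-in for the denoiser's hidden layer: a fixed random linear map + tanh.
W = rng.normal(size=(M, D)) * 0.5 / np.sqrt(D)
x = rng.uniform(0.2, 0.8, size=D)      # toy "context image" with values in [0, 1]

# Monte Carlo samples: one noise level and one Gaussian noise draw per sample.
# The range [0.3, 0.6] is an assumed stand-in for a "vulnerable" timestep range.
alpha_bar = rng.uniform(0.3, 0.6, size=(K, 1))
noise = rng.normal(size=(K, D))


def hidden(img):
    """Hidden states of the noised image for all K Monte Carlo samples, shape (K, M)."""
    noised = np.sqrt(alpha_bar) * img + np.sqrt(1.0 - alpha_bar) * noise
    return np.tanh(noised @ W.T)


# Centroid of the clean image's hidden states; held fixed during the attack.
centroid = hidden(x).mean(axis=0)


def digression(img):
    """Mean squared distance between the image's hidden states and the centroid."""
    return np.mean(np.sum((hidden(img) - centroid) ** 2, axis=1))


def digression_grad(img):
    """Analytic gradient of digression() with respect to the image, shape (D,)."""
    h = hidden(img)
    back = 2.0 * (h - centroid) * (1.0 - h ** 2)          # through the tanh
    return np.mean(np.sqrt(alpha_bar) * (back @ W), axis=0)


# Projected sign-gradient ascent with a random start inside the budget.
delta = rng.uniform(-EPS, EPS, size=D)
for _ in range(N_STEPS):
    g = digression_grad(np.clip(x + delta, 0.0, 1.0))
    delta = np.clip(delta + STEP * np.sign(g), -EPS, EPS)  # stay within the budget
    delta = np.clip(x + delta, 0.0, 1.0) - x               # keep a valid image

x_adv = x + delta
dist_clean = digression(x)
dist_adv = digression(x_adv)
print(f"digression clean: {dist_clean:.4f}, immunized: {dist_adv:.4f}")
```

The random start matters in this toy setting: at the clean image, the hidden states average out to the centroid by construction, so the gradient there is close to zero and gives the ascent little direction to follow.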

Disrupting Diffusion-based Inpainters with Semantic Digression

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization

Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models

Anti-forensics of Diffusion-Based Image Inpainting.

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

Toward Robust Imperceptible Perturbation against Unauthorized Text-to-image Diffusion-based Synthesis

Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models

StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Coexistence of Deepfake Defenses: Addressing the Poisoning Challenge

Raising the Cost of Malicious AI-Powered Image Editing

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models

Coherent and Multi-modality Image Inpainting via Latent Space Optimization

Invisible Backdoor Attacks on Diffusion Models

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models

Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model

TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning

Disrupting Deepfakes: Adversarial Attacks Against Conditional Image Translation Networks and Facial Manipulation Systems