Abstract: In recent years, diffusion models have achieved remarkable success in high-quality image generation and have attracted growing attention. This interest is paralleled by growing concern over the security threats associated with diffusion models, largely because they are susceptible to malicious exploitation. Notably, recent research has shown that diffusion models are vulnerable to backdoor attacks, in which a trigger causes the model to generate a specific target image. However, prevailing backdoor attack methods rely on manually crafted trigger generation functions, which often produce discernible patterns in the input noise and are therefore easy for humans to detect. In this paper, we present a versatile optimization framework for learning invisible triggers, which improves the stealthiness and resilience of the inserted backdoors. The proposed framework applies to both unconditional and conditional diffusion models, and we are the first to demonstrate the backdooring of diffusion models in text-guided image editing and inpainting pipelines. Moreover, we show that backdoors in conditional generation can be directly applied to model watermarking for ownership verification, which further increases the significance of the proposed framework. Extensive experiments on various commonly used samplers and datasets verify the efficacy and stealthiness of the proposed framework. Our code is publicly available at https://github.com/invisibleTriggerDiffusion/invisible_triggers_for_diffusion.

PureDiffusion: Using Backdoor to Counter Backdoor in Generative Diffusion Models

How to Backdoor Diffusion Models?

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors

Diff-Cleanse: Identifying and Mitigating Backdoor Attacks in Diffusion Models

Invisible Backdoor Attacks on Diffusion Models

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models

Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey

UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models

Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models

A stealthy and robust backdoor attack via frequency domain transform

Backdoor Mitigation by Correcting the Distribution of Neural Activations

The last Dance : Robust backdoor attack via diffusion models and bayesian approach

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Defending Text-to-image Diffusion Models: Surprising Efficacy of Textual Perturbations Against Backdoor Attacks

Backdoor Attack in the Physical World

Toward effective protection against diffusion based mimicry through score distillation

Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense