Dynamic Negative Guidance of Diffusion Models

Felix Koulischer, Johannes Deleu, Gabriel Raya, Thomas Demeester, Luca Ambrogioni
2024-10-18
Abstract: Negative Prompting (NP) is widely utilized in diffusion models, particularly in text-to-image applications, to prevent the generation of undesired features. In this paper, we show that conventional NP is limited by the assumption of a constant guidance scale, which may lead to highly suboptimal results, or even complete failure, due to the non-stationarity and state-dependence of the reverse process. Based on this analysis, we derive a principled technique called Dynamic Negative Guidance, which relies on a near-optimal time and state dependent modulation of the guidance without requiring additional training. Unlike NP, negative guidance requires estimating the posterior class probability during the denoising process, which is achieved with limited additional computational overhead by tracking the discrete Markov Chain during the generative process. We evaluate the performance of DNG class-removal on MNIST and CIFAR10, where we show that DNG leads to higher safety, preservation of class balance and image quality when compared with baseline methods. Furthermore, we show that it is possible to use DNG with Stable Diffusion to obtain more accurate and less invasive guidance than NP.
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the limitations of the Negative Prompting (NP) method in Diffusion Models (DMs). Specifically, traditional NP methods rely on a fixed guidance scale, which may lead to sub-optimal or even completely failed results because the reverse generation process is non-stationary and state-dependent. To overcome this problem, the authors propose a new technique, Dynamic Negative Guidance (DNG). DNG improves on NP by introducing an approximately optimal guidance scale that varies with time and state, thereby enhancing the quality and safety of the generated results.

#### Summary of main problems:

1. **Limitations of fixed guidance scale**: Traditional NP methods use a fixed guidance scale and cannot adapt to the requirements of different stages in the generation process, resulting in poor performance in some cases.
2. **Non-stationarity and state-dependence**: The reverse generation process of diffusion models is non-stationary and state-dependent, and traditional fixed-guidance methods have difficulty dealing with these characteristics.
3. **Improving generation quality and safety**: A more flexible and effective guidance method is needed to ensure the quality of the generated content and avoid generating unwanted features.

### Characteristics of Dynamic Negative Guidance (DNG):

- **Time- and state-dependent guidance scale**: DNG dynamically adjusts the guidance scale by estimating the posterior class probability, so that the scale changes according to the current state of the generation process.
- **No additional training required**: DNG does not require additional training steps; it only needs to track the discrete Markov chain during the generation process.
- **Higher safety and image quality**: Experiments show that DNG can better remove images of specific classes on the MNIST and CIFAR10 datasets while maintaining the quality and diversity of the generated images.
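
The summary does not spell out how the posterior is tracked along the Markov chain. One plausible reading, assumed here purely for illustration, is a recursive Bayes update: at each reverse step the running posterior is multiplied by the ratio of the class-conditional to the unconditional transition density. The sketch below uses scalar Gaussian transitions with a shared standard deviation; the function names and the one-dimensional setting are hypothetical and are not taken from the paper.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a scalar Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def update_log_posterior(log_post_prev, x_next, mean_cond, mean_uncond, std):
    """One assumed Bayes step along the reverse chain.

    log p(c | x_next) = log p(c | x_prev)
                        + log p(x_next | x_prev, c)
                        - log p(x_next | x_prev)

    mean_cond / mean_uncond stand in for the means of the conditional and
    unconditional reverse transitions (hypothetical inputs).
    """
    log_ratio = (gaussian_logpdf(x_next, mean_cond, std)
                 - gaussian_logpdf(x_next, mean_uncond, std))
    # Clamp at log(1): an approximate update must not exceed probability 1.
    return min(log_post_prev + log_ratio, 0.0)


# Usage: start from a prior of 0.1 and take one step that lands on the
# conditional mean, which should raise the posterior.
log_post = update_log_posterior(math.log(0.1), x_next=1.0,
                                mean_cond=1.0, mean_uncond=0.8, std=0.2)
posterior = math.exp(log_post)
```

Working in log space keeps the running product of density ratios numerically stable over many denoising steps. When the conditional and unconditional transitions coincide, the ratio is 1 and the posterior is left unchanged.
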
### Formula explanation:

In the DNG method, the guidance scale is dynamic and depends on the posterior probability \( p(c^-|x) \). The guided score is:

\[ \nabla_x \log p_t(x|c^+) = \nabla_x \log p_t(x) - \lambda(x, t) \left( \nabla_x \log p_t(x|c^-) - \nabla_x \log p_t(x) \right) \]

where:

- \( \lambda(x, t) = \lambda_0 \frac{p(c^-|x)}{1 - p(c^-|x)} \) is the dynamic guidance scale.
- \( p(c^-|x) \) is the posterior class probability, representing the probability that the generated content belongs to an unwanted category.

The scale is therefore close to zero when the sample is unlikely to belong to the unwanted category and grows as that probability approaches 1. In this way, DNG adaptively adjusts the guidance intensity during the generation process, thereby achieving more precise and robust negative guidance.
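
The formula can be checked numerically on a toy problem. The sketch below assumes that \( c^+ \) means "not \( c^- \)" and that \( \lambda_0 = 1 \); under those assumptions the guided score should coincide with the score of the data distribution with the unwanted class removed. The two-component 1D Gaussian mixture, its parameters, and all function names are hypothetical choices for illustration only.

```python
import math

# Toy model (assumed): the unwanted class c^- and the remaining data are
# each a 1D Gaussian, given as (mean, std); PRIOR_NEG is p(c^-).
PRIOR_NEG = 0.3
NEG = (-2.0, 1.0)
POS = (3.0, 1.5)


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_score(x, mean, std):
    """d/dx log N(x; mean, std^2)."""
    return -(x - mean) / (std * std)


def posterior_neg(x):
    """Exact posterior p(c^- | x) under the toy mixture."""
    num = PRIOR_NEG * normal_pdf(x, *NEG)
    return num / (num + (1.0 - PRIOR_NEG) * normal_pdf(x, *POS))


def unconditional_score(x):
    """Score of the mixture: posterior-weighted sum of component scores."""
    p = posterior_neg(x)
    return p * normal_score(x, *NEG) + (1.0 - p) * normal_score(x, *POS)


def dynamic_scale(x, lambda0=1.0, eps=1e-12):
    """lambda(x) = lambda0 * p / (1 - p); eps guards against division by zero."""
    p = posterior_neg(x)
    return lambda0 * p / max(1.0 - p, eps)


def guided_score(x, lambda0=1.0):
    """Unconditional score pushed away from the unwanted class."""
    s = unconditional_score(x)
    s_neg = normal_score(x, *NEG)
    return s - dynamic_scale(x, lambda0) * (s_neg - s)


# Usage: compare the guided score with the score of the class-free data.
for x in (-3.0, 0.0, 2.0, 5.0):
    print(x, guided_score(x), normal_score(x, *POS))
```

In this toy setting the scale is negligible far from the unwanted mode and large near it, which is the state dependence that a constant guidance scale cannot reproduce. In a real diffusion model the posterior is not available in closed form and has to be estimated during sampling.
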