Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale

Candi Zheng, Yuan Lan
2024-06-03
Abstract: Popular guidance for denoising diffusion probabilistic model (DDPM) linearly combines distinct conditional models together to provide enhanced control over samples. However, this approach overlooks nonlinear effects that become significant when guidance scale is large. To address this issue, we propose characteristic guidance, a guidance method that provides first-principle non-linear correction for classifier-free guidance. Such correction forces the guided DDPMs to respect the Fokker-Planck (FP) equation of diffusion process, in a way that is training-free and compatible with existing sampling methods. Experiments show that characteristic guidance enhances semantic characteristics of prompts and mitigates irregularities in image generation, proving effective in diverse applications ranging from simulating magnet phase transitions to latent space sampling.
Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Data Analysis, Statistics and Probability
### What problem does this paper attempt to solve?

This paper addresses the nonlinear effects that classifier-free guidance for denoising diffusion probabilistic models (DDPMs) ignores at large guidance scales. Specifically:

1. **Limitations of classifier-free guidance**:
   - Classifier-free guidance strengthens control over sample generation by linearly combining a conditional model and an unconditional model.
   - When the guidance scale is large, this linear combination ignores significant nonlinear effects, which degrades the quality of generated images, for example through abnormal color saturation and an unnatural appearance.
2. **Deviation from the Fokker-Planck equation**:
   - At large guidance scales, classifier-free guidance deviates from the Fokker-Planck (FP) equation of the diffusion process. This breaks the equivalence between the forward and backward diffusion processes and leads to irregularities in the generated samples.
3. **The characteristic guidance method**:
   - To address these problems, the authors propose characteristic guidance, a first-principles nonlinear correction for classifier-free guidance.
   - Characteristic guidance forces the guided DDPM to obey the FP equation, requires no additional training, and is compatible with existing sampling methods.
4. **Experimental verification**:
   - The experiments show that characteristic guidance enhances the semantic characteristics of generated images, reduces irregularities at large guidance scales, and applies to tasks ranging from simulating magnetic phase transitions to sampling in latent space.
### Formula summary

- **Forward process of the denoising diffusion probabilistic model**:
  \[ x_i=\sqrt{\bar{\alpha}_i}x_0+\sqrt{1 - \bar{\alpha}_i}\bar{\epsilon}_i,\quad 1\leq i\leq n \]
  where \(\bar{\alpha}_i\) is the contamination weight at time \(t_i\in[0, T]\), and \(\bar{\epsilon}_i\) is standard Gaussian noise.
- **Classifier-free guidance**:
  \[ \epsilon_{\text{CF}}(x|c, t_i,\omega)=(1 + \omega)\epsilon_\theta(x|c, t_i)-\omega\epsilon_\theta(x, t_i) \]
  where \(\omega\) is the guidance scale, \(\epsilon_\theta(x|c, t_i)\) is the conditional noise prediction, and \(\epsilon_\theta(x, t_i)\) is the unconditional one.
- **Characteristic guidance**:
  \[ \epsilon_{\text{CH}}(x|c, t_i,\omega)=(1 + \omega)\epsilon_\theta(x_1|c, t_i)-\omega\epsilon_\theta(x_2, t_i) \]
  where \(x_1 = x+\omega\Delta x\), \(x_2 = x+(1 + \omega)\Delta x\), and \(\Delta x\) is a nonlinear correction term. Setting \(\Delta x = 0\) recovers classifier-free guidance.
- **Mixed error**:
  \[ e_m(\epsilon, x, t)=\frac{\partial\epsilon}{\partial t}-\frac{1}{2}\left(L_\epsilon-\frac{1}{\sigma(t)}\nabla_x\|\epsilon\|^2_2\right) \]

With this correction, characteristic guidance improves the quality and consistency of samples generated at large guidance scales.
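The forward process and the two guidance formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The summary does not say how \(\Delta x\) is obtained, so the sketch assumes it solves the fixed-point relation \(\Delta x = \sigma_i\,(\epsilon_\theta(x_2, t_i) - \epsilon_\theta(x_1|c, t_i))\) with \(\sigma_i = \sqrt{1 - \bar{\alpha}_i}\), and it solves that relation with a plain, undamped iteration. The helper names (`forward_noise`, `cfg_eps`, `characteristic_eps`, `gaussian_eps`) and the one-dimensional Gaussian toy denoisers are hypothetical stand-ins for a trained \(\epsilon_\theta\).

```python
import numpy as np

def forward_noise(x0, alpha_bar, rng):
    """Forward process: x_i = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

def cfg_eps(eps_cond, eps_uncond, x, omega):
    """Classifier-free guidance: (1 + omega) * eps(x|c) - omega * eps(x)."""
    return (1.0 + omega) * eps_cond(x) - omega * eps_uncond(x)

def characteristic_eps(eps_cond, eps_uncond, x, omega, sigma, n_iter=200, tol=1e-12):
    """Characteristic guidance with dx found by plain fixed-point iteration.

    ASSUMPTION (not stated in the summary): dx satisfies
        dx = sigma * (eps_uncond(x2) - eps_cond(x1)),
    with x1 = x + omega * dx and x2 = x + (1 + omega) * dx.
    The undamped iteration may fail to converge at large omega.
    """
    dx = np.zeros_like(x)
    for _ in range(n_iter):
        x1 = x + omega * dx
        x2 = x + (1.0 + omega) * dx
        new_dx = sigma * (eps_uncond(x2) - eps_cond(x1))
        converged = np.max(np.abs(new_dx - dx)) < tol
        dx = new_dx
        if converged:
            break
    x1 = x + omega * dx
    x2 = x + (1.0 + omega) * dx
    eps = (1.0 + omega) * eps_cond(x1) - omega * eps_uncond(x2)
    return eps, dx

def gaussian_eps(mu, var, alpha_bar):
    """Hypothetical toy denoiser: exact E[noise | x_i] when x_0 ~ N(mu, var)."""
    a, sigma = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
    return lambda x: sigma * (x - a * mu) / (alpha_bar * var + sigma**2)

# Toy setup: narrow conditional distribution, broad unconditional distribution.
alpha_bar = 0.5
sigma = np.sqrt(1.0 - alpha_bar)
eps_c = gaussian_eps(mu=1.0, var=0.25, alpha_bar=alpha_bar)
eps_u = gaussian_eps(mu=0.0, var=1.0, alpha_bar=alpha_bar)

rng = np.random.default_rng(0)
x0 = rng.normal(1.0, 0.5, size=5)
x = forward_noise(x0, alpha_bar, rng)

omega = 3.0
eps_cf = cfg_eps(eps_c, eps_u, x, omega)
eps_ch, dx = characteristic_eps(eps_c, eps_u, x, omega, sigma)
print("classifier-free:", eps_cf)
print("characteristic: ", eps_ch)
```

In this toy setting the two predictions coincide at \(\omega = 0\) and differ once \(\omega > 0\), because the conditional and unconditional denoisers have different slopes. With a real network, each iteration costs two extra model evaluations, and a damped or accelerated solver would likely be needed for the iteration to converge at large guidance scales.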