DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation

Wenliang Zhao, Haolin Wang, Jie Zhou, Jiwen Lu
2024-09-06
Abstract: Diffusion probabilistic models (DPMs) have shown remarkable performance in visual synthesis but are computationally expensive due to the need for multiple evaluations during the sampling. Recent predictor-corrector diffusion samplers have significantly reduced the required number of function evaluations (NFE), but inherently suffer from a misalignment issue caused by the extra corrector step, especially with a large classifier-free guidance scale (CFG). In this paper, we introduce a new fast DPM sampler called DC-Solver, which leverages dynamic compensation (DC) to mitigate the misalignment of the predictor-corrector samplers. The dynamic compensation is controlled by compensation ratios that are adaptive to the sampling steps and can be optimized on only 10 datapoints by pushing the sampling trajectory toward a ground truth trajectory. We further propose a cascade polynomial regression (CPR) which can instantly predict the compensation ratios on unseen sampling configurations. Additionally, we find that the proposed dynamic compensation can also serve as a plug-and-play module to boost the performance of predictor-only samplers. Extensive experiments on both unconditional sampling and conditional sampling demonstrate that our DC-Solver can consistently improve the sampling quality over previous methods on different DPMs with a wide range of resolutions up to 1024$\times$1024. Notably, we achieve 10.38 FID (NFE=5) on unconditional FFHQ and 0.394 MSE (NFE=5, CFG=7.5) on Stable-Diffusion-2.1. Code is available at https://github.com/wl-zhao/DC-Solver
Computer Vision and Pattern Recognition
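The dynamic compensation named in the abstract can be illustrated with a small sketch. It assumes the compensated estimate is a Lagrange interpolation of the K+1 most recent model outputs, evaluated at a shifted timestep t' = ρ·t_i + (1 − ρ)·t_{i−1}. This is not the authors' implementation: the function names `compensated_time` and `lagrange_estimate` are hypothetical, and plain arrays stand in for the cached denoising-network outputs.

```python
import numpy as np


def compensated_time(rho, t_cur, t_prev):
    """Shifted query time t' = rho * t_i + (1 - rho) * t_{i-1}."""
    return rho * t_cur + (1.0 - rho) * t_prev


def lagrange_estimate(t_query, times, eps_values):
    """Lagrange-interpolate cached model outputs at t_query.

    times:      K+1 distinct timesteps, ordered t_i, t_{i-1}, ..., t_{i-K}
    eps_values: matching cached outputs, shape (K+1, ...)
    Both arguments are illustrative stand-ins, not the paper's API.
    """
    times = np.asarray(times, dtype=float)
    eps_values = np.asarray(eps_values, dtype=float)
    estimate = np.zeros_like(eps_values[0])
    for k in range(len(times)):
        # Lagrange basis weight for node k, evaluated at t_query.
        weight = 1.0
        for l in range(len(times)):
            if l != k:
                weight *= (t_query - times[l]) / (times[k] - times[l])
        estimate = estimate + weight * eps_values[k]
    return estimate


# Toy usage: three cached "outputs" sampled from a known function of t.
times = [0.5, 0.6, 0.7]  # t_i, t_{i-1}, t_{i-2}
cached = np.array([[t ** 2, 2.0 * t] for t in times])
t_shifted = compensated_time(0.8, times[0], times[1])
estimate = lagrange_estimate(t_shifted, times, cached)
```

With ρ = 1 the query time equals t_i and the estimate reduces to the most recent cached output, so the sampler is unchanged. Other ratios move the query point, which is the degree of freedom the compensation ratios tune.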
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses fast sampling for diffusion probabilistic models (DPMs) in visual generation. Although DPMs excel at high-quality image synthesis, their sampling process is computationally expensive because it requires many evaluations of the denoising network. Recent predictor-corrector samplers have significantly reduced the required number of function evaluations (NFE), but they suffer from a misalignment problem caused by the extra corrector step, especially with a large classifier-free guidance (CFG) scale.

To address this, the paper proposes a new fast DPM sampler, DC-Solver (Dynamic Compensation Solver). DC-Solver mitigates the misalignment in predictor-corrector samplers through dynamic compensation (DC): compensation ratios that adapt to the sampling steps are optimized so that the sampling trajectory stays close to a ground-truth trajectory. The paper also proposes cascade polynomial regression (CPR), which can instantly predict the compensation ratios for unseen sampling configurations. Experimental results show that DC-Solver improves sampling quality across different DPMs and resolutions, especially in few-step sampling.

### Main contributions

1. **Dynamic Compensation (DC)**:
   - Introduces a dynamic compensation mechanism that mitigates the misalignment in the predictor-corrector framework through adaptive compensation ratios.
   - The compensation ratios can be optimized quickly on a small number of data points.
2. **Cascade Polynomial Regression (CPR)**:
   - Proposes a cascade polynomial regression method that can instantly predict the compensation ratios for unseen sampling configurations, making DC-Solver more flexible and efficient in practice.
3. **Extensive experimental verification**:
   - Conducts extensive experiments on unconditional and conditional sampling to verify the performance of DC-Solver across different DPMs and resolutions.
   - Experimental results show that DC-Solver significantly outperforms existing methods in few-step sampling, especially in high-resolution image generation and text-to-image generation.

### Formula summary

- **Dynamic compensation estimation**:

  \[ \hat{\epsilon}_{\rho_i}(\tilde{x}_{t_i}, t_i) = \sum_{k = 0}^{K} \left( \prod_{\substack{0 \leq l \leq K \\ l \neq k}} \frac{t_i' - t_{i - l}}{t_{i - k} - t_{i - l}} \right) \epsilon_\theta(\tilde{x}_{t_{i - k}}, t_{i - k}) \]

  where \( t_i' = \rho_i t_i + (1 - \rho_i) t_{i - 1} \) and \( K \) is the order of the Lagrange interpolation.

- **Compensation ratio optimization**:

  \[ \rho_i^* = \arg \min_{\rho_i} \mathbb{E} \left[ \| \tilde{x}_{t_{i + 1}}^c - x_{t_{i + 1}}^{GT} \|^2_2 \right] \]

- **Cascade polynomial regression**:

  \[ \phi_j^{(2)} = f^{(p_1)}(\mathrm{NFE} \mid \phi_j^{(1)}) \]

  \[ \phi_j^{(3)} = f^{(p_2)}(\mathrm{CFG} \mid \phi_j^{(2)}) \]

  \[ \hat{\rho}_i^* = f^{(p_3)}