A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models

Taehong Moon, Moonseok Choi, EungGu Yun, Jongmin Yoon, Gayoung Lee, Jaewoong Cho, Juho Lee
2024-08-12
Abstract: Diffusion models have shown remarkable performance in generation problems over various domains including images, videos, text, and audio. A practical bottleneck of diffusion models is their sampling speed, due to the repeated evaluation of score estimation networks during the inference. In this work, we propose a novel framework capable of adaptively allocating compute required for the score estimation, thereby reducing the overall sampling time of diffusion models. We observe that the amount of computation required for the score estimation may vary along the time step for which the score is estimated. Based on this observation, we propose an early-exiting scheme, where we skip the subset of parameters in the score estimation network during the inference, based on a time-dependent exit schedule. Using the diffusion models for image synthesis, we show that our method could significantly improve the sampling throughput of the diffusion models without compromising image quality. Furthermore, we also demonstrate that our method seamlessly integrates with various types of solvers for faster sampling, capitalizing on their compatibility to enhance overall efficiency. The source code and our experiments are available at https://github.com/taehong-moon/ee-diffusion
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the slow sampling speed of diffusion models in generation tasks. Specifically:

1. **Bottleneck of diffusion models**:
   - Diffusion models perform well on generation tasks across domains such as images, video, text, and audio, but slow sampling is a major bottleneck in practical applications.
   - During inference, the score estimation network must be evaluated repeatedly, which makes sampling computationally expensive and slow.
2. **Limitations of existing methods**:
   - Existing approaches accelerate sampling by, for example, improving ODE/SDE solvers or distilling the model into one that requires fewer sampling steps.
   - These methods usually reduce the number of sampling steps rather than the computation spent at each time step.
3. **The proposed framework**:
   - The paper proposes Adaptive Score Estimation (ASE), a framework that reduces overall sampling time by adaptively allocating compute across time steps.
   - Based on the observation that the computation required for score estimation varies with the time step, ASE uses an early-exiting scheme: during inference, it skips a subset of the score estimation network's parameters according to a time-dependent exit schedule.
4. **Objectives and advantages**:
   - The paper shows that ASE can significantly improve the sampling throughput of diffusion models without compromising image quality.
   - ASE also integrates with various types of solvers, so the two can be combined for further speedup.
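The early-exiting idea in point 3 can be illustrated with a minimal, hedged sketch. This is not the paper's implementation: the function names (`exit_schedule`, `make_blocks`, `score_net`), the toy residual blocks, and the linear ramp used as the schedule are all assumptions chosen only to show how a time-dependent schedule reduces the number of blocks evaluated per step. In particular, which time steps tolerate more skipping is not stated in this summary, so the direction of the ramp here is arbitrary.

```python
import math


def exit_schedule(t, num_steps, max_skip):
    """Hypothetical schedule: how many trailing blocks to skip at step t (1..num_steps).

    A linear ramp is used purely for illustration; its shape and direction
    are assumptions, not the schedule from the paper.
    """
    frac = (t - 1) / max(num_steps - 1, 1)
    return int(round(frac * max_skip))


def make_blocks(num_blocks):
    """Toy residual blocks standing in for the layers of a score network."""
    def make(scale):
        return lambda h: [v + scale * math.tanh(v) for v in h]
    return [make(0.1 * (i + 1)) for i in range(num_blocks)]


def score_net(x, t, blocks, num_steps, max_skip):
    """Run only the blocks kept by the schedule; return (output, blocks used)."""
    n_keep = len(blocks) - exit_schedule(t, num_steps, max_skip)
    h = list(x)
    for block in blocks[:n_keep]:
        h = block(h)
    return h, n_keep


num_steps, num_blocks, max_skip = 10, 8, 4
blocks = make_blocks(num_blocks)
used = [
    score_net([0.5, -1.0], t, blocks, num_steps, max_skip)[1]
    for t in range(1, num_steps + 1)
]
print("blocks evaluated per step:", used)
print("total:", sum(used), "of", num_steps * num_blocks)
```

Because the schedule depends only on `t`, the per-step cost is known before sampling starts, and the same wrapper could in principle sit behind any solver that calls the network once per step.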
### Formula explanations

- **Forward process of diffusion models**:
  \[ q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t \mid \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right) \]
  where \(\beta_t\) is the noise schedule parameter at step \(t\) and \(I\) is the identity matrix.
- **Backward diffusion process**:
  \[ p_\theta(x_{0:T}) = p(x_T)\prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t) \]
  where \(p(x_T)\) is the standard Gaussian distribution.
- **Loss function** (a sum of KL terms, minimized with respect to \(\theta\)):
  \[ L(\theta) = \sum_{t=1}^{T}\mathbb{E}_q\left[D_{KL}\left[q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\right]\right] \]
- **Score estimation network**:
  \[ s_\theta(x_t, t) = -\frac{\epsilon_t}{\sqrt{1-\bar{\alpha}_t}} \]
  where \(\epsilon_t\) is the network's noise estimate at step \(t\) and \(\bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)\) is the cumulative noise parameter.

With these formulas, the paper describes how diffusion models work and builds its accelerated sampling method on top of them.
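As a small numerical check of the score formula, the sketch below works in one dimension. It relies on two things not spelled out in the summary: the standard closed form \(x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon\), obtained by composing the one-step forward kernels, and a linear \(\beta_t\) schedule from \(10^{-4}\) to \(0.02\) over 1000 steps, which is an assumption made only for illustration. The helper names are hypothetical.

```python
import math
import random


def alpha_bars(betas):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def score_from_noise(eps, alpha_bar):
    """Score expressed through the noise: -eps / sqrt(1 - abar_t)."""
    return -eps / math.sqrt(1.0 - alpha_bar)


def conditional_score(x_t, x0, alpha_bar):
    """d/dx_t of log N(x_t | sqrt(abar_t) * x0, 1 - abar_t)."""
    return -(x_t - math.sqrt(alpha_bar) * x0) / (1.0 - alpha_bar)


# Assumed linear noise schedule (illustrative, not taken from the summary).
num_steps = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (num_steps - 1) for i in range(num_steps)]
abar = alpha_bars(betas)

rng = random.Random(0)
x0, t = 1.5, 500
eps = rng.gauss(0.0, 1.0)

# Closed-form sample of x_t given x0 (composition of the one-step kernels).
x_t = math.sqrt(abar[t - 1]) * x0 + math.sqrt(1.0 - abar[t - 1]) * eps

gap = abs(score_from_noise(eps, abar[t - 1]) - conditional_score(x_t, x0, abar[t - 1]))
print("difference between the two score expressions:", gap)
```

The two expressions agree up to floating-point error because \(x_t - \sqrt{\bar{\alpha}_t}\,x_0 = \sqrt{1-\bar{\alpha}_t}\,\epsilon\); a trained network replaces the true noise with its estimate \(\epsilon_t\).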