Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
2024-10-12
Abstract: Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs. Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path. Its key components include: 1) using the diffusion form of flow-matching, 2) employing $\boldsymbol v$-prediction, and 3) performing rectification (a.k.a. reflow). In this paper, we argue that the success of rectification primarily lies in using a pretrained diffusion model to obtain matched pairs of noise and samples, followed by retraining with these matched noise-sample pairs. Based on this, components 1) and 2) are unnecessary. Furthermore, we highlight that straightness is not an essential training target for rectification; rather, it is a specific case of flow-matching models. The more critical training target is to achieve a first-order approximate ODE path, which is inherently curved for models like DDPM and Sub-VP. Building on this insight, we propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models, rather than being restricted to flow-matching models. We validate our method on Stable Diffusion v1-5 and Stable Diffusion XL. Our method not only greatly simplifies the training procedure of rectified flow-based previous works (e.g., InstaFlow) but also achieves superior performance with even lower training cost. Our code is available at https://github.com/G-U-N/Rectified-Diffusion.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the slow sampling speed of diffusion models for visual generation, which stems from the computational cost of solving the generative ordinary differential equation (ODE). One widely used remedy, rectified flow, speeds up generation by straightening the ODE path. The authors argue that the benefit of rectification comes not from straightness itself but from using a pretrained diffusion model to obtain matched noise-sample pairs and then retraining on those pairs.

The main contributions of the paper include:

1. **Theoretical Analysis**: The authors argue that the flow-matching form and $\boldsymbol v$-prediction are unnecessary for rectification, and that the critical training target is a first-order approximate ODE path rather than a straight one. Straightness is the special case that arises for flow-matching models; for models like DDPM and Sub-VP, the first-order path is inherently curved.
2. **Method Improvement**: Based on this understanding, the authors propose "Rectified Diffusion," which extends rectification to the broader category of diffusion models instead of restricting it to flow-matching models.
3. **Empirical Validation**: In experiments on Stable Diffusion v1-5 and Stable Diffusion XL, the authors report that their method simplifies the training procedure of earlier rectified flow-based work (e.g., InstaFlow) and achieves better performance at lower training cost.

Overall, the paper reinterprets what makes rectified flow effective and uses that interpretation to build a more general and cheaper way to accelerate diffusion model sampling.
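
The distinction between a "first-order" path and a "straight" path can be illustrated with a small numerical sketch. It assumes (this is an interpretation, not something stated in the text above) that a first-order path keeps the usual diffusion form $x_t = \alpha_t x_0 + \sigma_t \epsilon$ for one fixed matched pair $(x_0, \epsilon)$. The function names and the cosine variance-preserving schedule are hypothetical choices made for illustration and are not taken from the authors' code.

```python
import math

def flow_matching_schedule(t):
    # Flow-matching form: alpha_t = 1 - t, sigma_t = t.
    return 1.0 - t, t

def vp_schedule(t):
    # Hypothetical variance-preserving (DDPM-like) form with
    # alpha_t^2 + sigma_t^2 = 1; the cosine shape is an assumption.
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def point_on_path(x0, eps, t, schedule):
    # x_t = alpha_t * x0 + sigma_t * eps for a fixed matched pair (x0, eps).
    alpha, sigma = schedule(t)
    return [alpha * a + sigma * e for a, e in zip(x0, eps)]

def one_step_sample(x_t, eps_pred, t, schedule):
    # Invert the path in a single step given an epsilon prediction.
    # Requires alpha_t != 0, so t must stay below 1 for these schedules.
    alpha, sigma = schedule(t)
    return [(x - sigma * e) / alpha for x, e in zip(x_t, eps_pred)]

def deviation_from_chord(x0, eps, schedule):
    # Distance between the path point at t = 0.5 and the midpoint of the
    # straight segment joining x0 and eps; zero means the path is straight there.
    mid = point_on_path(x0, eps, 0.5, schedule)
    chord_mid = [0.5 * (a + e) for a, e in zip(x0, eps)]
    return math.dist(mid, chord_mid)

if __name__ == "__main__":
    x0, eps = [1.0, -2.0], [0.5, 0.3]  # toy matched sample/noise pair
    for name, schedule in [("flow-matching", flow_matching_schedule),
                           ("variance-preserving", vp_schedule)]:
        x_t = point_on_path(x0, eps, 0.9, schedule)
        recovered = one_step_sample(x_t, eps, 0.9, schedule)
        print(name, recovered, deviation_from_chord(x0, eps, schedule))
```

Under these assumptions, both schedules recover $x_0$ in one step when the epsilon prediction is exact for the matched pair, even though only the flow-matching path coincides with the straight segment between $x_0$ and $\epsilon$. In a real model the prediction comes from a trained network, so the recovery is only approximate.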