CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji

2024-04-23

Abstract: Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability. We propose tuning-free CutDiffusion, aimed at simplifying and accelerating the diffusion extrapolation process, making it more affordable and improving performance. CutDiffusion abides by the existing patch-wise extrapolation but cuts a standard patch diffusion process into an initial phase focused on comprehensive structure denoising and a subsequent phase dedicated to specific detail refinement. Comprehensive experiments highlight the advantages of CutDiffusion: (1) simple method construction that enables a concise higher-resolution diffusion process without third-party engagement; (2) fast inference speed achieved through a single-step higher-resolution diffusion process and fewer required inference patches; (3) cheap GPU cost resulting from patch-wise inference and fewer patches during the comprehensive structure denoising; (4) strong generation performance, stemming from the emphasis on specific detail refinement.

Computer Vision and Pattern Recognition, Artificial Intelligence

What problem does this paper attempt to address?

This paper proposes CutDiffusion, a tuning-free method that simplifies and accelerates diffusion extrapolation, i.e., adapting a diffusion model pre-trained at low resolution to generate higher-resolution images, while lowering cost and improving generation quality. Diffusion models are commonly used to generate detailed images from text descriptions, but they adapt poorly when a resolution higher than the training resolution is required. CutDiffusion addresses this by cutting the standard patch diffusion process into two stages: the first focuses on comprehensive structure denoising, and the second refines specific details. The advantages of CutDiffusion are as follows:

1. Simple construction: no third-party involvement is required, which yields a concise higher-resolution diffusion process.
2. Fast inference: a single-step higher-resolution diffusion process and fewer inference patches reduce inference time.
3. Low GPU cost: patch-wise inference, with fewer patches during structure denoising, reduces GPU cost.
4. Strong generation performance: the emphasis on specific detail refinement improves generation quality.

Compared with existing methods, CutDiffusion achieves faster inference and lower GPU cost while maintaining high-quality image generation, without tuning the pre-trained model's parameters. The paper also compares CutDiffusion with other methods to show its advantages in different application scenarios.
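To make the two-stage idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the structure stage uses a few non-overlapping patches and the detail stage uses more, overlapping patches whose predictions are averaged; the summary only states that the first stage needs fewer patches, so this split is an assumption. The names `toy_denoise`, `patchwise_step`, `two_phase_sample`, and `cut_step` are hypothetical, and `toy_denoise` merely stands in for one step of a pre-trained low-resolution denoiser.

```python
import numpy as np

def patch_starts(size, patch, stride):
    """Window offsets along one axis so windows of length `patch` cover `size`.
    Assumes size >= patch."""
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # make sure the last window reaches the border
    return starts

def toy_denoise(patch, t):
    # Hypothetical stand-in for one denoising step of a pre-trained model.
    return patch * 0.9

def patchwise_step(latent, patch, stride, t, denoise=toy_denoise):
    """Denoise every window once and average predictions where windows overlap."""
    h, w = latent.shape
    out = np.zeros_like(latent)
    count = np.zeros_like(latent)
    n_patches = 0
    for y in patch_starts(h, patch, stride):
        for x in patch_starts(w, patch, stride):
            out[y:y + patch, x:x + patch] += denoise(latent[y:y + patch, x:x + patch], t)
            count[y:y + patch, x:x + patch] += 1
            n_patches += 1
    return out / count, n_patches

def two_phase_sample(latent, patch, total_steps, cut_step):
    """Run `cut_step` structure steps, then the remaining steps as detail refinement."""
    calls = {"structure": 0, "detail": 0}
    for step in range(total_steps):
        if step < cut_step:
            # Assumed structure stage: non-overlapping patches, fewer model calls.
            latent, n = patchwise_step(latent, patch, stride=patch, t=step)
            calls["structure"] += n
        else:
            # Assumed detail stage: overlapping patches, averaged in the overlaps.
            latent, n = patchwise_step(latent, patch, stride=patch // 2, t=step)
            calls["detail"] += n
    return latent, calls

rng = np.random.default_rng(0)
noise = rng.standard_normal((8, 8))
result, calls = two_phase_sample(noise, patch=4, total_steps=4, cut_step=2)
print(calls)
```

In this toy setting, each structure step calls the denoiser on 4 patches and each detail step on 9, which illustrates why moving early steps into the cheaper stage reduces the total number of patch inferences.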

AccDiffusion v2: Towards More Accurate Higher-Resolution Diffusion Extrapolation

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future

Partially Conditioned Patch Parallelism for Accelerated Diffusion Model Inference

FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models

PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

SparseDM: Toward Sparse Efficient Diffusion Models

DeepCache: Accelerating Diffusion Models for Free

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Plug-and-Play Diffusion Distillation

Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule

Simple and Fast Distillation of Diffusion Models

Differential Diffusion: Giving Each Pixel Its Strength

SkipDiff: Adaptive Skip Diffusion Model for High-Fidelity Perceptual Image Super-resolution

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

DuoDiff: Accelerating Diffusion Models with a Dual-Backbone Approach

AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation