Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou

2023-10-19

Abstract: Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which thus helps democratize diffusion model training to broader users. At the core of our innovations is a new conditional score function at the patch level, where the patch location in the original image is included as additional coordinate channels, while the patch size is randomized and diversified throughout training to encode the cross-region dependency at multiple scales. Sampling with our method is as easy as in the original diffusion model. Through Patch Diffusion, we could achieve $\mathbf{\ge 2\times}$ faster training, while maintaining comparable or better generation quality. Patch Diffusion meanwhile improves the performance of diffusion models trained on relatively small datasets, e.g., as few as 5,000 images to train from scratch. We achieve outstanding FID scores in line with state-of-the-art benchmarks: 1.77 on CelebA-64$\times$64, 1.93 on AFHQv2-Wild-64$\times$64, and 2.72 on ImageNet-256$\times$256. We share our code and pre-trained models at https://github.com/Zhendong-Wang/Patch-Diffusion.

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

This paper addresses the long training time and large data requirements of diffusion models. Although diffusion models generate high-quality images, training them is slow and demands substantial compute and data, which limits their adoption and keeps researchers without large computational budgets from working in this area.

To tackle these issues, the paper proposes Patch Diffusion, a training framework that operates on image patches. Performing conditional score matching on small patches rather than on full images reduces the computational cost of each iteration, which speeds up training and improves data efficiency. Each patch is accompanied by extra coordinate channels that record its location in the original image, and the patch size is randomized throughout training, so the model can still learn cross-region dependencies at multiple scales. This design balances training efficiency against the encoding of global structure. According to the abstract, Patch Diffusion trains at least 2 times faster than standard training while maintaining comparable or better generation quality, and it improves the performance of diffusion models trained from scratch on small datasets of as few as 5,000 images. These gains lower the barrier to training diffusion models for a broader group of users.
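The patch sampling step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `coordinate_channels` and `sample_patch`, the [-1, 1] coordinate normalization, and the patch sizes and probabilities in the usage note are all assumptions chosen for illustration. The sketch only shows how a randomly sized crop can carry its position in the full image as two extra channels.

```python
import numpy as np


def coordinate_channels(height, width):
    """Return a (2, H, W) array of x and y pixel coordinates scaled to [-1, 1].

    The [-1, 1] normalization is an assumption made for this sketch.
    """
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_x, grid_y], axis=0)


def sample_patch(image, patch_sizes, probs, rng):
    """Crop a random square patch and append its full-image coordinates.

    image: array of shape (C, H, W).
    patch_sizes, probs: candidate patch sizes and their sampling
        probabilities (hypothetical values, not the paper's schedule).
    Returns the (C + 2, s, s) patch and its (top, left, size) location.
    """
    _, height, width = image.shape
    # Randomize the patch size so that both local and global context are seen.
    size = int(rng.choice(patch_sizes, p=probs))
    # Pick a top-left corner such that the patch fits inside the image.
    top = int(rng.integers(0, height - size + 1))
    left = int(rng.integers(0, width - size + 1))
    # Coordinates are computed on the full image, then cropped with the patch,
    # so each pixel keeps its position relative to the original image.
    coords = coordinate_channels(height, width)
    patch = image[:, top:top + size, left:left + size]
    patch_coords = coords[:, top:top + size, left:left + size]
    return np.concatenate([patch, patch_coords], axis=0), (top, left, size)
```

For example, `sample_patch(image, [16, 32, 64], [0.3, 0.2, 0.5], np.random.default_rng(0))` on a 3×64×64 image returns a 5-channel patch whose last two channels encode where the crop sits in the original image. When the sampled size equals the image size, the patch is the full image and the coordinate channels span the whole [-1, 1] range.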

PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution

Masked Diffusion Models Are Fast Distribution Learners

AccDiffusion v2: Towards More Accurate Higher-Resolution Diffusion Extrapolation

CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks

PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future

DiffPatch: Generating Customizable Adversarial Patches using Diffusion Model

Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

Patch Diffusion: A General Module for Face Manipulation Detection

Patched Denoising Diffusion Models For High-Resolution Image Synthesis

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Partially Conditioned Patch Parallelism for Accelerated Diffusion Model Inference

Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

Real-world Adversarial Defense against Patch Attacks based on Diffusion Model

Relational Diffusion Distillation for Efficient Image Generation

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination

Diffusion Model Patching via Mixture-of-Prompts

Distribution-Aware Data Expansion with Diffusion Models