Elucidating The Design Space of Classifier-Guided Diffusion Generation

Jiajun Ma, Tianyang Hu, Wenjia Wang, Jiacheng Sun
2023-10-17
Abstract: Guidance in conditional diffusion generation is of great importance for sample quality and controllability. However, existing guidance schemes leave much to be desired. On one hand, mainstream methods such as classifier guidance and classifier-free guidance both require extra training with labeled data, which is time-consuming and cannot adapt to new conditions. On the other hand, training-free methods such as universal guidance, though more flexible, have yet to demonstrate comparable performance. In this work, through a comprehensive investigation into the design space, we show that it is possible to achieve significant performance improvements over existing guidance schemes by leveraging off-the-shelf classifiers in a training-free fashion, enjoying the best of both worlds. Employing calibration as a general guideline, we propose several pre-conditioning techniques to better exploit pretrained off-the-shelf classifiers for guiding diffusion generation. Extensive experiments on ImageNet validate our proposed method, showing that state-of-the-art diffusion models (DDPM, EDM, DiT) can be further improved (up to 20%) using off-the-shelf classifiers with barely any extra computational cost. With the proliferation of publicly available pretrained classifiers, our proposed approach has great potential and can be readily scaled up to text-to-image generation tasks. The code is available at https://github.com/AlexMaOLS/EluCD/tree/main.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the guidance problem in conditional diffusion generation. Specifically, the authors ask how to improve the quality and controllability of samples generated by diffusion models using off-the-shelf classifiers, without additional training. The key problems the paper attempts to solve are:

1. **Limitations of existing guidance schemes**:
   - **Classifier Guidance (CG) and Classifier-Free Guidance (CFG)**: Both mainstream methods require additional training with labeled data, which is time-consuming and makes it hard to adapt to new conditions.
   - **Training-free methods**: Training-free methods such as Universal Guidance are more flexible, but they have not yet shown performance comparable to CG and CFG in formal quantitative evaluations.
2. **Exploring the guidance design space**:
   - Through a comprehensive study of the guidance design space, the paper shows that off-the-shelf classifiers can deliver significant performance improvements in a training-free manner.
   - Using calibration as a general guiding principle, the authors propose several pre-conditioning techniques that make better use of pretrained off-the-shelf classifiers when guiding diffusion generation.
3. **Improving generation quality and efficiency**:
   - Experiments on ImageNet validate the proposed method: off-the-shelf classifiers improve state-of-the-art diffusion models (DDPM, EDM, DiT) by up to 20% with barely any extra computational cost.
   - As the number of publicly available pretrained classifiers grows, the method has great potential and can be readily scaled up to text-to-image generation tasks.
4. **Flexibility and adaptability**:
   - Because it needs no extra training, the method improves generation quality while also adapting more easily to new conditions, which makes it more practical in real-world applications.

In summary, by rethinking the design space of classifier guidance, the paper proposes an efficient, high-performance, and flexible guidance method that aims to overcome the limitations of existing guidance schemes for conditional diffusion generation.
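
To make the classifier guidance discussed above concrete, here is a minimal sketch of a single guided sampling step. It follows the standard classifier-guidance rule, in which the denoising mean is shifted by the gradient of the classifier's log-probability, and it is not the paper's own implementation. Everything named here is an illustrative assumption: a toy linear-softmax classifier stands in for an off-the-shelf classifier, and `classifier_log_prob_grad`, `guided_mean`, `W`, `b`, and `scale` are hypothetical names. The paper's pre-conditioning techniques are not reproduced.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())


def classifier_log_prob_grad(x, W, b, y):
    """Gradient of log p(y | x) with respect to x.

    Assumes a toy linear-softmax classifier with logits = W @ x + b.
    A real off-the-shelf classifier would need automatic differentiation.
    """
    probs = np.exp(log_softmax(W @ x + b))
    # d/dx log softmax(Wx + b)[y] = W[y] - sum_k p_k * W[k]
    return W[y] - probs @ W


def guided_mean(mean, variance, x, W, b, y, scale=1.0):
    """Shift the denoising mean toward class y.

    Standard classifier-guidance update:
        mean + scale * variance * grad_x log p(y | x)
    where the gradient is evaluated at the current noisy sample x.
    """
    return mean + scale * variance * classifier_log_prob_grad(x, W, b, y)


# Toy usage: two classes in two dimensions, guiding toward class 0.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
x_t = np.zeros(2)    # current noisy sample (hypothetical value)
mean = np.zeros(2)   # denoiser's predicted mean (hypothetical value)
shifted = guided_mean(mean, variance=0.1, x=x_t, W=W, b=b, y=0, scale=2.0)
print(shifted)
```

Setting `scale` to zero recovers the unguided mean, and larger values push samples harder toward the target class. How well that push works depends on how reliable the classifier's probabilities are along the sampling trajectory, which is where the calibration guideline mentioned above comes in.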