Provable Statistical Rates for Consistency Diffusion Models

Zehao Dou, Minshuo Chen, Mengdi Wang, Zhuoran Yang
2024-06-24
Abstract: Diffusion models have revolutionized various application domains, including computer vision and audio generation. Despite the state-of-the-art performance, diffusion models are known for their slow sample generation due to the extensive number of steps involved. In response, consistency models have been developed to merge multiple steps in the sampling process, thereby significantly boosting the speed of sample generation without compromising quality. This paper contributes towards the first statistical theory for consistency models, formulating their training as a distribution discrepancy minimization problem. Our analysis yields statistical estimation rates based on the Wasserstein distance for consistency models, matching those of vanilla diffusion models. Additionally, our results encompass the training of consistency models through both distillation and isolation methods, demystifying their underlying advantage.
Machine Learning
What problem does this paper attempt to address?
The main problem this paper attempts to address is: **What is the statistical error rate of consistency models in estimating the data distribution, and how does it compare to that of traditional diffusion models?** Specifically, the paper aims to explain the effectiveness of consistency models from the perspective of statistical estimation, particularly how they significantly improve sampling speed while maintaining the quality of generated samples. By analyzing the training process of consistency models in terms of the Wasserstein distance, the paper establishes a theoretical framework for statistical estimation and presents the following main contributions:

1. **Formulating the training of consistency models as a Wasserstein distance minimization problem**: This is the first systematic description of the training objective of consistency models, covering the training methods commonly used in practice.
2. **Establishing statistical distribution estimation guarantees for consistency models trained via distillation methods**: It is proven that the distribution estimation error is primarily determined by the score estimation error, indicating that consistency models retain the distribution estimation capability of traditional diffusion models while improving sampling efficiency.
3. **Extending the study to isolation methods**: Similar statistical estimation results are established without pre-trained score functions, achieving a statistical error rate of \( \mathcal{O}(n^{-1/d}) \).

These results provide a solid theoretical foundation for consistency models, explaining their success in practice.
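
To give a feel for what a rate of \( \mathcal{O}(n^{-1/d}) \) implies, the following minimal sketch evaluates the bound with the hidden constant set to 1 and any logarithmic factors ignored. Both simplifications are assumptions made here for illustration, and the function names `error_bound` and `sample_growth_to_halve_error` are hypothetical and not taken from the paper. Solving \( (kn)^{-1/d} = \tfrac{1}{2} n^{-1/d} \) gives \( k = 2^d \), so under this simplified bound the sample size must grow by a factor of \( 2^d \) to halve the error.

```python
import math


def error_bound(n: int, d: int) -> float:
    """Illustrative n^(-1/d) bound; the constant is set to 1 (an assumption)."""
    return n ** (-1.0 / d)


def sample_growth_to_halve_error(d: int) -> int:
    """Factor k by which n must grow so that (k*n)^(-1/d) is half of n^(-1/d)."""
    return 2 ** d


if __name__ == "__main__":
    # Print how the required growth in sample size scales with dimension d.
    for d in (1, 2, 10, 100):
        print(f"d={d}: multiply n by {sample_growth_to_halve_error(d)} to halve the bound")
```

The exponential dependence on \( d \) in this toy calculation is the usual curse of dimensionality for distribution estimation in Wasserstein distance; it says nothing about the constants or structural assumptions in the paper's actual theorems.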