Abstract: Conditional diffusion models serve as the foundation of modern image synthesis and find extensive application in fields like computational biology and reinforcement learning. In these applications, conditional diffusion models incorporate various conditional information, such as prompt input, to guide the sample generation towards desired properties. Despite the empirical success, the theory of conditional diffusion models is largely missing. This paper bridges this gap by presenting a sharp statistical theory of distribution estimation using conditional diffusion models. Our analysis yields a sample complexity bound that adapts to the smoothness of the data distribution and matches the minimax lower bound. The key to our theoretical development lies in an approximation result for the conditional score function, which relies on a novel diffused Taylor approximation technique. Moreover, we demonstrate the utility of our statistical theory in elucidating the performance of conditional diffusion models across diverse applications, including model-based transition kernel estimation in reinforcement learning, solving inverse problems, and reward conditioned sample generation.

What problem does this paper attempt to address?

This paper addresses the lack of theory for Conditional Diffusion Models (CDMs), especially when they are trained with classifier-free guidance. Although conditional diffusion models have achieved remarkable empirical success in fields such as image synthesis, computational biology, and reinforcement learning, their theoretical foundation has not been fully developed. Specifically, the paper aims to answer two core questions:

1. **How do conditional diffusion models estimate the conditional score function in the case of classifier-free guidance?**
2. **What are the corresponding statistical rates of conditional distribution estimation?**

### Main contributions of the paper

To answer these questions, the authors establish the first general approximation theory for conditional diffusion models and make the following contributions:

1. **Approximation theory of the conditional score function**:
   - The authors prove the first general approximation theory (Theorem 3.2) for approximating the conditional score function with ReLU neural networks. They show how the network size needed to reach a target L2 error adapts to the smoothness of the data distribution.
   - Under Assumption 3.3, they further establish an improved approximation result (Theorem 3.4), where the approximation error is \(O\left(\frac{B^2}{\sigma_t^2} \cdot N^{-\frac{2\beta}{d + d_y}} \cdot (\log N)^{\beta + 1}\right)\).
2. **Distribution estimation theory**:
   - The authors study distribution estimation using conditional diffusion models and provide sample complexity bounds (Theorem 4.2). The conditional score estimation result (Theorem 4.1) is connected to the distribution estimation theory through the Girsanov theorem.
   - Their statistical rates match the minimax lower bounds (Proposition 4.3), and, for the first time, they provide statistical guarantees for conditional diffusion models in model-based reinforcement learning (Proposition 4.5).
3. **Extended applications**:
   - The authors also establish a theoretical basis for conditional diffusion models in solving inverse problems and in reward-conditioned sample generation, demonstrating the practical reach of the statistical theory.
   - Specifically, they provide sub-optimality bounds for generating high-reward samples in the offline setting (Proposition 5.2) and error bounds for posterior mean estimation in linear inverse problems (Proposition 5.4).

### Summary

This paper fills a theoretical gap for conditional diffusion models, especially for analysis under classifier-free guidance. By establishing an approximation theory for the conditional score function and a distribution estimation theory, the authors both explain the success of conditional diffusion models and provide theoretical support for their use in different tasks.
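To get a feel for the Theorem 3.4 rate quoted above, the following minimal sketch evaluates the expression inside the \(O(\cdot)\) with the hidden constant dropped. It is an illustration only, not code from the paper: the function name `approx_error_rate` and the default values are assumptions, \(N\) is treated as the network size parameter, and \(B\) and \(\sigma_t\) are treated as plain positive scalars because the summary does not define them further.

```python
import math


def approx_error_rate(N, beta, d, d_y, B=1.0, sigma_t=1.0):
    """Evaluate (B^2 / sigma_t^2) * N^(-2*beta/(d + d_y)) * (log N)^(beta + 1).

    Hypothetical helper: the constant hidden in the O(.) is omitted, so only
    the trend in N, beta, d and d_y is meaningful, not the absolute value.
    """
    if N <= 1:
        raise ValueError("N must be greater than 1 so that log N is positive")
    if sigma_t <= 0:
        raise ValueError("sigma_t must be positive")
    polynomial_term = N ** (-2.0 * beta / (d + d_y))
    log_term = math.log(N) ** (beta + 1)
    return (B ** 2 / sigma_t ** 2) * polynomial_term * log_term


if __name__ == "__main__":
    # Same smoothness and network size, increasing data dimension d.
    for d in (2, 8, 16):
        value = approx_error_rate(N=10**6, beta=2.0, d=d, d_y=1)
        print(f"d={d:2d}  rate expression = {value:.3e}")
```

With smoothness \(\beta\) held fixed, the exponent \(-2\beta/(d + d_y)\) shrinks in magnitude as the combined dimension \(d + d_y\) grows, so the same network size yields a weaker guarantee in higher dimensions; a larger \(\beta\) has the opposite effect.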

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory

Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models

Classifier-Free Diffusion Guidance

Conditional Diffusion with Less Explicit Guidance via Model Predictive Control

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

A Simple Approach to Unifying Diffusion-based Conditional Generation

Guidance with Spherical Gaussian Constraint for Conditional Diffusion

Exploring Guided Sampling of Conditional GANs

Elucidating The Design Space of Classifier-Guided Diffusion Generation

TFG: Unified Training-Free Guidance for Diffusion Models

Rectified Diffusion Guidance for Conditional Generation

Conditional Image Synthesis with Diffusion Models: A Survey

Universal Guidance for Diffusion Models

Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Inner Classifier-Free Guidance and Its Taylor Expansion for Diffusion Models

Simple Guidance Mechanisms for Discrete Diffusion Models

On the Generalization Properties of Diffusion Models

Your Diffusion Model is Secretly a Zero-Shot Classifier

Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers

Conditional sampling within generative diffusion models