Abstract: This paper investigates score-based diffusion models when the underlying target distribution is concentrated on or near low-dimensional manifolds within the higher-dimensional space in which they formally reside, a common characteristic of natural image distributions. Despite previous efforts to understand the data generation process of diffusion models, existing theoretical support remains highly suboptimal in the presence of low-dimensional structure, which we strengthen in this paper. For the popular Denoising Diffusion Probabilistic Model (DDPM), we find that the dependency of the error incurred within each denoising step on the ambient dimension $d$ is in general unavoidable. We further identify a unique design of coefficients that yields a convergence rate at the order of $O(k^{2}/\sqrt{T})$ (up to log factors), where $k$ is the intrinsic dimension of the target distribution and $T$ is the number of steps. This represents the first theoretical demonstration that the DDPM sampler can adapt to unknown low-dimensional structures in the target distribution, highlighting the critical importance of coefficient design. All of this is achieved by a novel set of analysis tools that characterize the algorithmic dynamics in a more deterministic manner.

What problem does this paper attempt to address?

This paper studies how well score-based diffusion models adapt to data with low-dimensional structure. Existing theoretical analyses are far from tight when such structure is present, and the paper aims to close this gap. The object of study is the Denoising Diffusion Probabilistic Model (DDPM), which generates new samples from the target distribution through a step-by-step denoising process.

The paper notes that although DDPM performs well on complex distributions such as images, audio, and text, its accuracy is limited by two sources of error: discretization error (from using a finite number of steps) and score estimation error (from an inaccurate estimate of the score function). Existing theory suggests that reaching a given accuracy requires a number of steps proportional to the ambient dimension of the problem, which is far more than the number of steps used in practice (e.g., for natural image datasets).

Because the distribution of natural images tends to concentrate on low-dimensional manifolds within the high-dimensional space, it is reasonable to expect the convergence rate of the DDPM sampler to depend on the intrinsic dimension of the target distribution rather than on the ambient dimension. The paper proves that, with a specific choice of coefficients, the error of the DDPM sampler measured in total variation distance depends mainly on the intrinsic dimension and is nearly independent of the ambient dimension. It also shows that this coefficient design is, in a certain sense, the unique one that prevents the per-step discretization error from scaling with the ambient dimension.

The main contributions of the paper include:

1. Providing an upper bound on the error of the DDPM sampler of order $O(k^{2}/\sqrt{T})$ (up to log factors), where $k$ is the intrinsic dimension and $T$ is the number of steps; the ambient dimension $d$ enters only through logarithmic factors.
2. Demonstrating that the chosen coefficient design is to some extent unique and prevents the discretization error from being proportional to the ambient dimension.
3. Giving the first theoretical proof that the DDPM sampler can adapt to unknown low-dimensional structures in the target distribution.

Through these results, the paper narrows the gap between the theory and practice of the DDPM sampler and emphasizes the importance of coefficient design.
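To make the role of the per-step coefficients concrete, here is a minimal sketch of a DDPM-style reverse chain on a toy target supported on a $k$-dimensional subspace of $\mathbb{R}^d$. Everything in it is an assumption made for illustration rather than the paper's construction: the target is $X_0 = (G, 0)$ with $G \sim N(0, I_k)$ so that the exact score is available in closed form, the noise schedule is a hypothetical linear one, and the coefficients `eta` and `sigma2` follow the widely used DDPM posterior-variance parameterization, which may or may not coincide with the coefficient design analyzed in the paper. The function name `ddpm_sample_subspace` is likewise hypothetical.

```python
import numpy as np


def ddpm_sample_subspace(d=50, k=2, T=200, n=2000, seed=0):
    """Run a DDPM-style reverse chain on a toy low-dimensional target.

    Toy target (assumption): X_0 = (G, 0) with G ~ N(0, I_k) in R^d,
    so the score of every noisy marginal is known exactly.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical linear noise schedule; alpha_bar[t - 1] holds \bar{alpha}_t.
    betas = np.linspace(1e-4, 0.05, T)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    on_subspace = np.arange(d) < k      # first k coordinates carry the data
    y = rng.standard_normal((n, d))     # Y_T ~ N(0, I_d)

    for t in range(T, 0, -1):
        a, ab = alphas[t - 1], alpha_bar[t - 1]
        ab_prev = alpha_bar[t - 2] if t > 1 else 1.0

        # Forward marginal X_t has per-coordinate variance
        # 1 on the subspace and (1 - alpha_bar_t) off it.
        var_t = np.where(on_subspace, 1.0, 1.0 - ab)
        score = -y / var_t              # exact score of a centered Gaussian

        # Per-step coefficients (standard DDPM choice, an assumption here).
        eta = 1.0 - a
        sigma2 = (1.0 - ab_prev) / (1.0 - ab) * (1.0 - a)

        z = rng.standard_normal((n, d))
        y = (y + eta * score) / np.sqrt(a) + np.sqrt(sigma2) * z

    return y, on_subspace


if __name__ == "__main__":
    samples, mask = ddpm_sample_subspace()
    print("max |off-subspace coordinate|:", np.abs(samples[:, ~mask]).max())
    print("variance on subspace:", samples[:, mask].var())
```

In this toy setting the off-subspace coordinates should collapse toward zero by the last step, while the on-subspace coordinates keep a variance close to one. Changing `eta` or `sigma2` is a simple way to see how sensitive the chain is to the coefficient choice, though the sketch says nothing about the paper's rates.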

Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models
