Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models

Gen Li, Yuling Yan
2024-05-24
Abstract: This paper investigates score-based diffusion models when the underlying target distribution is concentrated on or near low-dimensional manifolds within the higher-dimensional space in which they formally reside, a common characteristic of natural image distributions. Despite previous efforts to understand the data generation process of diffusion models, existing theoretical support remains highly suboptimal in the presence of low-dimensional structure, which we strengthen in this paper. For the popular Denoising Diffusion Probabilistic Model (DDPM), we find that the dependency of the error incurred within each denoising step on the ambient dimension $d$ is in general unavoidable. We further identify a unique design of coefficients that yields a convergence rate at the order of $O(k^{2}/\sqrt{T})$ (up to log factors), where $k$ is the intrinsic dimension of the target distribution and $T$ is the number of steps. This represents the first theoretical demonstration that the DDPM sampler can adapt to unknown low-dimensional structures in the target distribution, highlighting the critical importance of coefficient design. All of this is achieved by a novel set of analysis tools that characterize the algorithmic dynamics in a more deterministic manner.
Machine Learning, Artificial Intelligence, Statistics Theory
What problem does this paper attempt to address?
This paper studies whether score-based diffusion models can adapt to data with low-dimensional structure. Existing theoretical analyses are far from tight when such structure is present, and the paper aims to close this gap. The object of study is the Denoising Diffusion Probabilistic Model (DDPM), which generates new samples from the target distribution through a step-by-step denoising process. Although DDPM performs well on complex distributions such as images, audio, and text, its sampling error has two sources: discretization error (from using a finite number of steps) and score estimation error (from an imperfectly learned score function). Current theory suggests that, to reach a given accuracy, the number of steps must scale with the ambient dimension of the problem, which far exceeds the number of steps that suffices in practice (e.g., for natural image datasets). The paper observes that the distribution of natural images tends to concentrate on low-dimensional manifolds in the high-dimensional space, so it is reasonable to expect the convergence rate of the DDPM sampler to depend on the intrinsic dimension of the target distribution rather than on the ambient dimension. The paper proves that, with a specific choice of coefficients, the error of the DDPM sampler measured in total variation distance depends mainly on the intrinsic dimension $k$ and on the ambient dimension $d$ only through logarithmic factors. It also shows that this coefficient design is unique in avoiding a per-step discretization error that grows with $d$. The main contributions of the paper are:
1. An upper bound on the error of the DDPM sampler: with $T$ steps, the error is of order $k^{2}/\sqrt{T}$ (up to log factors), where $k$ is the intrinsic dimension; the ambient dimension $d$ enters only through the logarithmic factors.
2. A demonstration that the chosen coefficient design is, in a certain sense, unique: for other choices, the error incurred within each denoising step depends on the ambient dimension in general.
3. The first theoretical proof that the DDPM sampler can adapt to unknown low-dimensional structures in the target distribution.
Through these results, the paper narrows the gap between the theory and practice of the DDPM sampler and emphasizes the importance of coefficient design.
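The role of the coefficients can be illustrated with a small numerical sketch. The text above does not state the paper's coefficients, so everything specific here is an assumption made for illustration: the update form $Y_{t-1} = (Y_t + \eta_t s_t(Y_t) + \sigma_t Z_t)/\sqrt{\alpha_t}$, the standard DDPM choice $\eta_t = 1-\alpha_t$ and $\sigma_t^2 = (1-\alpha_t)(\alpha_t-\bar{\alpha}_t)/(1-\bar{\alpha}_t)$, a linear noise schedule, and a toy Gaussian target supported on a $k$-dimensional subspace of $\mathbb{R}^d$ (chosen because its score is available in closed form, so there is no score estimation error). The function names are hypothetical.

```python
import numpy as np


def make_schedule(T, beta_min=1e-4, beta_max=0.02):
    """Linear noise schedule (an illustrative assumption, not from the paper)."""
    betas = np.linspace(beta_min, beta_max, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return alphas, alpha_bars


def exact_score(y, U, alpha_bar):
    """Score of X_t = sqrt(alpha_bar) X_0 + sqrt(1 - alpha_bar) W for the toy
    target X_0 = U z, z ~ N(0, I_k), where U has orthonormal columns.

    Cov(X_t) has eigenvalue 1 on span(U) and (1 - alpha_bar) off it.
    """
    on = (y @ U) @ U.T   # component inside the k-dimensional subspace
    off = y - on         # component in the remaining d - k directions
    return -on - off / (1.0 - alpha_bar)


def ddpm_sample(U, T=1000, n=500, seed=0):
    """Run the reverse process with the assumed coefficients and the exact score."""
    rng = np.random.default_rng(seed)
    d = U.shape[0]
    alphas, alpha_bars = make_schedule(T)
    y = rng.standard_normal((n, d))  # Y_T ~ N(0, I_d)
    for t in range(T - 1, -1, -1):
        a, ab = alphas[t], alpha_bars[t]
        eta = 1.0 - a                                        # assumed eta_t
        sigma = np.sqrt((1.0 - a) * (a - ab) / (1.0 - ab))   # assumed sigma_t
        z = rng.standard_normal((n, d))
        y = (y + eta * exact_score(y, U, ab) + sigma * z) / np.sqrt(a)
    return y


# Toy setup: a k = 2 dimensional subspace inside d = 32 ambient dimensions.
d, k = 32, 2
U, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((d, k)))
samples = ddpm_sample(U)

# Largest distance of any sample from the subspace, and the sample variance
# of the coordinates inside the subspace (the target has variance 1 there).
off_norm = np.linalg.norm(samples - (samples @ U) @ U.T, axis=1).max()
on_var = (samples @ U).var()
print(f"max off-subspace norm: {off_norm:.2e}, on-subspace variance: {on_var:.3f}")
```

In this toy case a short calculation shows that, with the assumed coefficients, each step maps the off-subspace variance $1-\bar{\alpha}_t$ exactly to $1-\bar{\alpha}_{t-1}$, so the $d-k$ ambient directions contribute no discretization error and the remaining error lives in the $k$ intrinsic directions. This is only a sanity check on a Gaussian example, not a reproduction of the paper's analysis.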