Abstract: Diffusion probabilistic generative models are widely used to generate high-quality data. Though they can synthesize data that does not exist in the training set, the rationale behind such generalization is still unexplored. In this paper, we formally define the generalization of a generative model, measured by the mutual information between the generated data and the training set. The definition originates from the intuition that a model which generates data less correlated with the training set exhibits better generalization ability. Meanwhile, we show that for the empirical optimal diffusion model, the data generated by a deterministic sampler are all highly related to the training set and thus generalize poorly. This result contradicts the observed extrapolation ability (generating unseen data) of trained diffusion models, which approximate the empirical optima. To understand this contradiction, we empirically verify the difference between the sufficiently trained diffusion model and the empirical optima. We find that, even after sufficient training, a slight difference between them remains, and this difference is critical to making the diffusion model generalizable. Moreover, we propose another training objective whose empirical optimal solution has no potential generalization problem. We empirically show that the proposed training objective returns a model similar to the original one, which further verifies the generalization ability of the trained diffusion model.
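For orientation, the quantity the abstract refers to can be written with the textbook definition of mutual information. The symbols below ($\hat{x}$ for a generated sample, $S$ for the training set) are notation assumed here for illustration and are not taken from the paper, whose exact formulation may differ.

```latex
% Standard mutual information between a generated sample \hat{x} and the training set S
% (notation assumed for illustration).
I(\hat{x}; S)
  = \mathbb{E}_{p(\hat{x}, S)}\!\left[ \log \frac{p(\hat{x}, S)}{p(\hat{x})\, p(S)} \right]
  = D_{\mathrm{KL}}\!\left( p(\hat{x}, S) \,\middle\|\, p(\hat{x})\, p(S) \right)
```

This quantity is zero exactly when $\hat{x}$ and $S$ are independent, and it grows as generated samples become more predictable from the training set, which matches the intuition that lower correlation indicates better generalization.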

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper explores why diffusion models, which generate high-quality data, are able to generalize, and addresses the following issues:

1. **Defining the Generalization Ability of Generative Models**:
   - The authors formally define the generalization ability of a generative model, measured by the mutual information between the generated data and the training set.
   - Under this definition, a model whose generated data is less correlated with the training set generalizes better.
2. **Analyzing the Generalization Problem of Diffusion Models**:
   - For the empirical optimal diffusion model, data generated by a deterministic sampler is highly correlated with the training set, which implies poor generalization.
   - This contradicts the observed ability of trained diffusion models to generate unseen data.
3. **Methods to Solve the Generalization Problem**:
   - A new training objective is proposed whose empirical optimal solution has no potential generalization problem.
   - Experiments show that training with the new objective returns a model similar to the one obtained with the original objective.

### Main Findings

- **Theoretical Analysis**: The paper shows that, for the empirical optimal diffusion model, data generated by a deterministic sampler is highly correlated with the training set, leading to poor generalization.
- **Role of Optimization Bias**: Experiments find that even a sufficiently trained model differs slightly from the empirical optimum, and this slight difference is critical to making the diffusion model generalizable.
- **New Training Objective**: A new training objective is proposed whose empirical optimum avoids the potential generalization problem. The model it returns is similar to the originally trained one, which further verifies the generalization ability of the trained diffusion model.
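The theoretical finding above can be illustrated with a small numerical sketch. It is not the paper's code: it assumes a standard variance-preserving forward process with a linear beta schedule, a toy 2D training set of four points, and a DDIM-style deterministic sampler. The function names `optimal_x0` and `ddim_sample` are hypothetical. For a finite training set, the empirical optimal denoiser has a well-known closed form: the posterior mean is a softmax-weighted average of the training points.

```python
import numpy as np

def optimal_x0(x_t, train, abar):
    """Posterior mean E[x_0 | x_t] when the data distribution is uniform over `train`.

    Assumes the forward process x_t = sqrt(abar) * x_0 + sqrt(1 - abar) * noise.
    """
    # Squared distance from each noisy sample to each scaled training point.
    diff = x_t[:, None, :] - np.sqrt(abar) * train[None, :, :]
    logits = -np.sum(diff ** 2, axis=-1) / (2.0 * (1.0 - abar))
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the softmax
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return w @ train

def ddim_sample(train, n_samples=64, T=1000, seed=0):
    """Deterministic (DDIM-style) sampling driven by the empirical optimal denoiser."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)  # assumed linear schedule
    abar = np.cumprod(1.0 - betas)
    x = rng.standard_normal((n_samples, train.shape[1]))
    for t in range(T - 1, -1, -1):
        x0_hat = optimal_x0(x, train, abar[t])
        eps_hat = (x - np.sqrt(abar[t]) * x0_hat) / np.sqrt(1.0 - abar[t])
        abar_prev = abar[t - 1] if t > 0 else 1.0  # the last step returns x0_hat
        x = np.sqrt(abar_prev) * x0_hat + np.sqrt(1.0 - abar_prev) * eps_hat
    return x

# Toy training set (an assumption for illustration): four corners of a square.
train = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
samples = ddim_sample(train)
nearest = np.min(
    np.linalg.norm(samples[:, None, :] - train[None, :, :], axis=-1), axis=1
)
print("max distance to nearest training point:", nearest.max())
```

In this toy setting every generated sample lands essentially on a training point, which is the kind of dependence on the training set that the mutual-information definition penalizes. A trained network only approximates this denoiser, which is consistent with the summary's point that the gap between the trained model and the empirical optimum matters for generalization.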

On the Generalization of Diffusion Model

On the Generalization Properties of Diffusion Models

The Emergence of Reproducibility and Generalizability in Diffusion Models

Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure

From memorization to generalization: a theoretical framework for diffusion-based generative models

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions

A Survey on Generative Diffusion Model

A Survey on Generative Diffusion Models

Exploring the Optimal Choice for Generative Processes in Diffusion Models: Ordinary vs Stochastic Differential Equations

Towards Theoretical Understandings of Self-Consuming Generative Models

Towards a Mechanistic Explanation of Diffusion Model Generalization

Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models

Transfer Learning for Diffusion Models

Diffusion Model for Data-Driven Black-Box Optimization

Diffusion Models: A Comprehensive Survey of Methods and Applications

An optimal control perspective on diffusion-based generative modeling

Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation

Theoretical research on generative diffusion models: an overview

Generalized Diffusion Model with Adjusted Offset Noise