Abstract: In this paper we identify the source of a singularity in the training loss of key denoising models that causes the denoiser's predictions to collapse towards the mean of the source or target distributions. This degeneracy creates false basins of attraction, distorting the denoising trajectories and ultimately increasing the number of steps required to sample these models. We circumvent this artifact by leveraging the deterministic ODE-based samplers offered by certain denoising diffusion and score-matching models, which establish a well-defined change-of-variables between the source and target distributions. Given this correspondence, we propose a new probability flow model, the Lines Matching Model (LMM), which matches globally straight lines interpolating the two distributions. We demonstrate that the flow fields produced by the LMM exhibit notable temporal consistency, resulting in trajectories with excellent straightness scores. Beyond its sampling efficiency, the LMM formulation allows us to enhance the fidelity of the generated samples by integrating domain-specific reconstruction and adversarial losses, and by optimizing its training for the sampling procedure used. Overall, the LMM achieves state-of-the-art FID scores with minimal NFEs on established benchmark datasets: 1.57/1.39 (NFE=1/2) on CIFAR-10, 1.47/1.17 on ImageNet 64x64, and 2.68/1.54 on AFHQ 64x64. Finally, we provide a theoretical analysis showing that the use of optimal transport to relate the two distributions suffers from a curse of dimensionality, where the pairing set size (mini-batch) must scale exponentially with the signal dimension.
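
The collapse towards the mean described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' model: it assumes a hypothetical 1-D dataset of three points and the additive noise model `x_noisy = x0 + sigma * eps`, and it evaluates the MSE-optimal denoiser `E[x0 | x_noisy]` in closed form. The function name `optimal_denoiser` and all numbers are illustrative assumptions.

```python
import math


def optimal_denoiser(x_noisy, data, sigma):
    """MSE-optimal denoiser E[x0 | x_noisy] for an empirical dataset.

    Assumes x_noisy = x0 + sigma * eps with eps ~ N(0, 1) and a uniform
    prior over the points in `data` (a toy setting, not the paper's model).
    """
    # Log posterior weight of each data point, up to a shared constant.
    logits = [-((x_noisy - x) ** 2) / (2.0 * sigma ** 2) for x in data]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    # Posterior mean: weighted average of the data points.
    return sum(w * x for w, x in zip(weights, data)) / total


data = [-2.0, -1.0, 3.0]            # hypothetical 1-D dataset
data_mean = sum(data) / len(data)   # 0.0

x_noisy = 2.5
# High SNR (small sigma): the prediction snaps to the nearest data point.
high_snr = optimal_denoiser(x_noisy, data, sigma=0.1)
# Low SNR (large sigma): the posterior weights become nearly uniform,
# so the prediction collapses towards the dataset mean.
low_snr = optimal_denoiser(x_noisy, data, sigma=100.0)

print(high_snr, low_snr, data_mean)
```

In this toy setting the low-SNR prediction carries almost no information about which data point produced the observation, which is the kind of degeneracy the abstract attributes to the denoising loss.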

### What problem does this paper attempt to solve?

This paper addresses a key problem in the denoising process of diffusion models: a **singularity in the training loss**. The singularity pushes the denoiser's predictions towards the mean of the source or target distribution, which creates false basins of attraction, distorts the denoising trajectories, and ultimately increases the number of steps required to sample these models.

#### Specific manifestations of the problem:

1. **Denoiser degeneracy**: Under low signal-to-noise ratio (SNR) conditions, the uncertainty in the denoising loss grows, and the denoiser's predictions tend towards the mean of the source or target distribution.
2. **False basins of attraction**: This degeneracy bends and distorts the denoising trajectories, increasing the number of steps required for accurate sampling.
3. **Low computational efficiency**: Because more sampling steps are needed, the computational cost of sampling increases.

#### Solution:

To overcome this problem, the authors propose a new probability flow model, the **Lines Matching Model (LMM)**. The paper contributes the following:

1. **Utilizing deterministic ODE samplers**: LMM uses the deterministic ODE-based samplers offered by certain denoising diffusion and score-matching models to establish a well-defined change of variables between the source and target distributions.
2. **Global straight-line interpolation**: LMM matches globally straight lines interpolating the two distributions, so that the resulting flow field is temporally consistent and produces trajectories with high straightness scores.
3. **Enhancing sample fidelity**: LMM improves the quality of generated samples by integrating domain-specific reconstruction and adversarial losses and by optimizing its training for the sampling procedure used.
4. **Theoretical analysis**: The authors also show that using Optimal Transport (OT) to relate the two distributions suffers from a curse of dimensionality, where the pairing set size (mini-batch) must grow exponentially with the signal dimension.

#### Experimental results:

LMM achieves state-of-the-art FID scores on several benchmark datasets with very few NFEs (Number of Function Evaluations). For example, on CIFAR-10, LMM reaches FID scores of 1.57 at NFE = 1 and 1.39 at NFE = 2.

In conclusion, this paper addresses the singularity problem in the denoising process of diffusion models by proposing LMM, improving both the sampling efficiency and the quality of the generated samples.
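
The link between trajectory straightness and the number of sampling steps can be sketched with a toy ODE sampler. The code below is an illustration under stated assumptions, not the LMM implementation: `euler_sample` and `straightness` are hypothetical helpers, the two velocity fields are made up, and the straightness measure (mean squared deviation of the per-step velocity from the end-to-end chord) is one common discretized definition; the text above does not say which score the authors use.

```python
def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x, dt = list(x0), 1.0 / num_steps
    traj = [list(x)]
    for k in range(num_steps):
        v = velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        traj.append(list(x))
    return traj


def straightness(traj):
    """Mean squared deviation of per-step velocities from the chord.

    Returns 0 for a straight, constant-speed path; larger values indicate
    more curvature (assumed definition, see the lead-in).
    """
    n = len(traj) - 1
    chord = [end - start for start, end in zip(traj[0], traj[-1])]
    total = 0.0
    for a, b in zip(traj[:-1], traj[1:]):
        step_v = [(bi - ai) * n for ai, bi in zip(a, b)]  # finite-difference velocity
        total += sum((c - v) ** 2 for c, v in zip(chord, step_v))
    return total / n


# Toy fields that both transport (0, 0) to (1, 1) when integrated exactly.
def straight_field(x, t):
    return [1.0, 1.0]        # constant velocity: a straight line


def curved_field(x, t):
    return [1.0, 2.0 * t]    # time-varying velocity: a parabola


start = [0.0, 0.0]
one_step_straight = euler_sample(straight_field, start, 1)[-1]
one_step_curved = euler_sample(curved_field, start, 1)[-1]
score_straight = straightness(euler_sample(straight_field, start, 10))
score_curved = straightness(euler_sample(curved_field, start, 100))

print(one_step_straight, one_step_curved, score_straight, score_curved)
```

With the straight field a single Euler step (NFE = 1) already lands on the exact endpoint, whereas the curved field needs many steps to get close, which is why straighter trajectories translate into fewer function evaluations.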

Generative Lines Matching Models
