Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs

Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu
2024-04-06
Abstract: Diffusion models have exhibited excellent performance in various domains. The probability flow ordinary differential equation (ODE) of diffusion models (i.e., diffusion ODEs) is a particular case of continuous normalizing flows (CNFs), which enables deterministic inference and exact likelihood evaluation. However, the likelihood estimation results by diffusion ODEs are still far from those of the state-of-the-art likelihood-based generative models. In this work, we propose several improved techniques for maximum likelihood estimation for diffusion ODEs, including both training and evaluation perspectives. For training, we propose velocity parameterization and explore variance reduction techniques for faster convergence. We also derive an error-bounded high-order flow matching objective for finetuning, which improves the ODE likelihood and smooths its trajectory. For evaluation, we propose a novel training-free truncated-normal dequantization to fill the training-evaluation gap commonly existing in diffusion ODEs. Building upon these techniques, we achieve state-of-the-art likelihood estimation results on image datasets (2.56 on CIFAR-10, 3.43/3.69 on ImageNet-32) without variational dequantization or data augmentation, and 2.42 on CIFAR-10 with data augmentation. Code is available at https://github.com/thu-ml/i-DODE.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the shortcomings of maximum likelihood estimation (MLE) for diffusion ordinary differential equations (ODEs) and proposes improved techniques that raise their likelihood performance on image datasets. The authors observe that existing methods estimate likelihood suboptimally in both training and evaluation, leaving a gap relative to state-of-the-art likelihood-based generative models. On the evaluation side, the paper introduces a training-free truncated-normal dequantization to close the training-evaluation gap and incorporates a weighted likelihood estimator for tighter bounds. On the training side, velocity parameterization and variance reduction techniques accelerate convergence, and an error-bounded high-order flow matching objective is used for fine-tuning, which improves the ODE likelihood and smooths its trajectory. With these improvements, the model achieves state-of-the-art likelihood estimation results on CIFAR-10 and ImageNet-32 (2.56 and 3.43/3.69, respectively) without variational dequantization or data augmentation.
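The general idea behind a truncated-normal dequantization can be illustrated with a short sketch. It is not the paper's implementation: the function names, the scaling of 8-bit pixels to [-1, 1], the noise scale `sigma`, and the choice to truncate the noise at half a quantization bin are all assumptions made for illustration. The sketch only shows how Gaussian-shaped noise can be confined to a pixel's bin, so that the dequantized value still maps back to the original integer.

```python
import numpy as np


def sample_truncated_normal(shape, tau, rng):
    """Draw standard-normal samples restricted to [-tau, tau] by rejection.

    Simple but slow when tau is very small; adequate for an illustration.
    """
    z = rng.standard_normal(shape)
    mask = np.abs(z) > tau
    while mask.any():
        # Redraw only the entries that fell outside the truncation bounds.
        z[mask] = rng.standard_normal(int(mask.sum()))
        mask = np.abs(z) > tau
    return z


def truncated_normal_dequantize(x_uint8, sigma, rng):
    """Map 8-bit pixels to [-1, 1] and add bin-confined Gaussian noise.

    Assumption: pixels are scaled as x / 127.5 - 1, so adjacent levels are
    2 / 255 apart and half a bin is 1 / 255. `sigma` is a hypothetical
    noise scale, not a value taken from the paper.
    """
    x = x_uint8.astype(np.float64) / 127.5 - 1.0
    half_bin = 1.0 / 255.0
    tau = half_bin / sigma  # truncation bound in units of sigma
    noise = sigma * sample_truncated_normal(x.shape, tau, rng)
    return x + noise


rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
dequantized = truncated_normal_dequantize(pixels, sigma=0.01, rng=rng)

# Rounding back to the 8-bit grid recovers the original pixel values.
recovered = np.rint((dequantized + 1.0) * 127.5).astype(np.uint8)
```

Rejection sampling is used here only to keep the example dependency-free; an inverse-CDF sampler would be the more efficient choice when the truncation bound is tight.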