Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method

Jiachun Pan, Hanshu Yan, Jun Hao Liew, Jiashi Feng, Vincent Y. F. Tan
2023-12-19
Abstract: Training-free guided sampling in diffusion models leverages off-the-shelf pre-trained networks, such as an aesthetic evaluation model, to guide the generation process. Current training-free guided sampling algorithms obtain the guidance energy function based on a one-step estimate of the clean image. However, since the off-the-shelf pre-trained networks are trained on clean images, the one-step estimation procedure of the clean image may be inaccurate, especially in the early stages of the generation process in diffusion models. This causes the guidance in the early time steps to be inaccurate. To overcome this problem, we propose Symplectic Adjoint Guidance (SAG), which calculates the gradient guidance in two inner stages. Firstly, SAG estimates the clean image via $n$ function calls, where $n$ serves as a flexible hyperparameter that can be tailored to meet specific image quality requirements. Secondly, SAG uses the symplectic adjoint method to obtain the gradients accurately and efficiently in terms of the memory requirements. Extensive experiments demonstrate that SAG generates images of higher quality than the baselines in both guided image and video generation tasks.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to make training-free guided sampling in diffusion models more accurate. Existing training-free guided sampling algorithms compute the guidance energy function from a one-step estimate of the clean image. That estimate can be inaccurate in the early stages of the generation process: the off-the-shelf pre-trained networks are trained on clean images, whereas early samples are still noisy. As a result, the guidance in the early time steps is less accurate.

To overcome this problem, the paper proposes **Symplectic Adjoint Guidance (SAG)**, which computes the gradient guidance in two inner stages:

1. **Multi-step estimation of the clean image**: SAG estimates the clean image through \(n\) function calls, where \(n\) is a flexible hyperparameter that can be adjusted to meet specific image quality requirements.
2. **Symplectic adjoint method**: SAG uses the symplectic adjoint method to obtain the gradients accurately while keeping the memory requirements low.

With these two stages, SAG produces higher-quality results across a range of image and video generation tasks, including style-guided image generation, aesthetic improvement, personalized generation, and video stylization. The experimental results show that SAG outperforms the baseline methods on these tasks.
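The difference between a one-step and an \(n\)-step estimate of the clean image, and the backward pass that turns a guidance loss into a gradient, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are one-dimensional Gaussian with mean `MU` and standard deviation `S`, the noising process is `x_t = x0 + t * noise`, the closed-form `denoiser` stands in for a trained network, and the names `estimate_clean` and `guidance_grad` are hypothetical. The backward pass is an ordinary discrete adjoint of the Euler steps, swapped in for the symplectic adjoint method, which is not reproduced here.

```python
import math

MU, S = 0.0, 1.0  # assumed toy data distribution N(MU, S^2)

def denoiser(x, t):
    """Posterior mean E[x0 | x_t]; stands in for a trained network (toy assumption)."""
    return MU + S**2 / (S**2 + t**2) * (x - MU)

def denoiser_vjp(t, v):
    """Vector-Jacobian product of the toy denoiser, which is linear in x."""
    return S**2 / (S**2 + t**2) * v

def time_grid(t, n):
    """n + 1 uniformly spaced times from t down to 0."""
    return [t * (1 - i / n) for i in range(n + 1)]

def estimate_clean(x_t, t, n):
    """Estimate x0 with n Euler steps (n denoiser calls) of dx/dt = (x - D(x, t)) / t."""
    ts = time_grid(t, n)
    x = x_t
    for a, b in zip(ts[:-1], ts[1:]):
        x = x + (b - a) * (x - denoiser(x, a)) / a
    return x

def guidance_grad(x_t, t, n, target):
    """Gradient of 0.5 * (x0_hat - target)^2 with respect to x_t."""
    ts = time_grid(t, n)
    lam = estimate_clean(x_t, t, n) - target  # dL/dx0_hat
    # Walk the Euler steps backwards, applying each step's transposed Jacobian.
    for a, b in reversed(list(zip(ts[:-1], ts[1:]))):
        lam = lam + (b - a) * (lam - denoiser_vjp(a, lam)) / a
    return lam

def exact_clean(x_t, t):
    """Closed-form solution of the toy ODE at time 0, used only as a reference."""
    return MU + (x_t - MU) * S / math.sqrt(S**2 + t**2)
```

In this toy setting, `estimate_clean(x_t, t, 1)` coincides with the denoiser output, which is the one-step estimate, and larger `n` moves the estimate closer to `exact_clean(x_t, t)` at the cost of more function calls. Because the toy denoiser is linear, the backward pass does not need the intermediate states; for a real network they would have to be stored or recomputed, which is the memory cost the paper's method is designed to keep low.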