ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting

Zongsheng Yue, Jianyi Wang, Chen Change Loy
2023-10-18
Abstract: Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed due to the requirements of hundreds or even thousands of sampling steps. Existing acceleration sampling techniques inevitably sacrifice performance to some extent, leading to over-blurry SR results. To address this issue, we propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps, thereby eliminating the need for post-acceleration during inference and its associated performance deterioration. Our method constructs a Markov chain that transfers between the high-resolution image and the low-resolution image by shifting the residual between them, substantially improving the transition efficiency. Additionally, an elaborate noise schedule is developed to flexibly control the shifting speed and the noise strength during the diffusion process. Extensive experiments demonstrate that the proposed method obtains superior or at least comparable performance to current state-of-the-art methods on both synthetic and real-world datasets, even with only 15 sampling steps. Our code and model are available at https://github.com/zsyOAOA/ResShift.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the slow inference of diffusion-based image super-resolution (SR). Existing diffusion SR methods need hundreds or even thousands of sampling steps, which makes inference very time-consuming. Accelerated sampling techniques exist, but they sacrifice some performance and tend to produce over-blurred results. The authors propose ResShift, an efficient diffusion model that constructs a Markov chain transferring between the high-resolution image and the low-resolution image by shifting the residual between them. This substantially reduces the number of diffusion steps and removes the need for post-acceleration during inference, along with the performance degradation it causes. The method also introduces a noise schedule that flexibly controls both the shifting speed and the noise strength during the diffusion process. Experiments show that ResShift achieves performance superior or at least comparable to current state-of-the-art methods on both synthetic and real-world datasets, even with only 15 sampling steps.
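The residual-shifting idea can be illustrated with a minimal sketch. It assumes a Gaussian marginal whose mean moves from the high-resolution image toward the low-resolution image as a shifting weight grows from near 0 to near 1, with noise variance scaled by the same weight. The function names, the parameter `kappa`, the geometric schedule, and the stand-in images are all hypothetical choices for illustration and are not taken from the text above; the low-resolution image is assumed to be already upsampled to the high-resolution size.

```python
import numpy as np


def shift_schedule(num_steps, eta_min=1e-3, eta_max=0.999):
    """Hypothetical monotone schedule of shifting weights in (0, 1).

    Geometric spacing is an illustrative assumption, not the paper's schedule.
    """
    return np.geomspace(eta_min, eta_max, num_steps)


def forward_marginal(x0, y0, eta_t, kappa, rng):
    """Draw x_t with mean x0 + eta_t * (y0 - x0) and std kappa * sqrt(eta_t).

    x0: high-resolution image, y0: low-resolution image at the same size.
    """
    residual = y0 - x0                      # the quantity being shifted
    noise = rng.standard_normal(x0.shape)   # unit Gaussian noise
    return x0 + eta_t * residual + kappa * np.sqrt(eta_t) * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # stand-in HR image
    y0 = 0.5 * x0                                # stand-in upsampled LR image
    etas = shift_schedule(15)                    # 15 steps, as in the abstract
    for step, eta in enumerate(etas, start=1):
        xt = forward_marginal(x0, y0, eta, kappa=0.0, rng=rng)
        # With kappa = 0 the state is a pure interpolation between x0 and y0.
        print(step, float(np.abs(xt - y0).mean()))
```

With `kappa=0.0` the printed distance to the low-resolution image shrinks at every step, which shows the shifting part of the chain in isolation; a positive `kappa` adds noise whose strength grows with the shifting weight.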