Dissecting Arbitrary-scale Super-resolution Capability from Pre-trained Diffusion Generative Models

Ruibin Li, Qihua Zhou, Song Guo, Jie Zhang, Jingcai Guo, Xinyang Jiang, Yifei Shen, Zhenhua Han

2023-06-01

Abstract: Diffusion-based Generative Models (DGMs) have achieved unparalleled performance in synthesizing high-quality visual content, opening up the opportunity to improve image super-resolution (SR) tasks. Recent solutions for these tasks often train architecture-specific DGMs from scratch, or require iterative fine-tuning and distillation on pre-trained DGMs, both of which take considerable time and hardware investments. More seriously, since the DGMs are established with a discrete pre-defined upsampling scale, they cannot readily meet the emerging requirements of arbitrary-scale super-resolution (ASSR), where a unified model adapts to arbitrary upsampling scales, instead of preparing a series of distinct models for each case. These limitations raise an intriguing question: can we identify the ASSR capability of existing pre-trained DGMs without the need for distillation or fine-tuning? In this paper, we take a step towards resolving this matter by proposing Diff-SR, a first ASSR attempt based solely on pre-trained DGMs, without additional training efforts. It is motivated by an exciting finding that a simple methodology, which first injects a specific amount of noise into the low-resolution images before invoking a DGM's backward diffusion process, outperforms current leading solutions. The key insight is determining a suitable amount of noise to inject, i.e., small amounts lead to poor low-level fidelity, while over-large amounts degrade the high-level signature. Through a fine-grained theoretical analysis, we propose the Perceptual Recoverable Field (PRF), a metric that achieves the optimal trade-off between these two factors. Extensive experiments verify the effectiveness, flexibility, and adaptability of Diff-SR, demonstrating superior performance to state-of-the-art solutions under diverse ASSR environments.

Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing

What problem does this paper attempt to address?

This paper addresses the limitations of existing diffusion-based generative models (DGMs) on arbitrary-scale super-resolution (ASSR) tasks. Current DGM-based solutions usually either train an architecture-specific model from scratch or apply iterative fine-tuning and distillation to a pre-trained DGM, and both routes are time-consuming and demand substantial hardware resources. More importantly, these models are built for discrete, predefined upsampling scales, so they cannot readily meet the ASSR requirement that a single unified model handle any upsampling scale, rather than a separate model being prepared for each case. To overcome these problems, the paper proposes Diff-SR, which it describes as the first ASSR attempt based solely on pre-trained DGMs, with no additional training. The core idea of Diff-SR is to inject a specific amount of noise into the low-resolution image and then invoke the DGM's reverse diffusion process to restore the image. The key lies in choosing the amount of injected noise: too little leads to poor low-level fidelity, while too much degrades the high-level signature of the image. To find the balance between the two, the paper introduces the Perceptual Recoverable Field (PRF) metric, supports it with a theoretical analysis, and verifies the effectiveness and flexibility of Diff-SR through experiments.
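The noise-injection step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it uses the standard DDPM forward-noising formula x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with a linear beta schedule, and the function names (`linear_beta_schedule`, `upsample_nearest`, `inject_noise`), the nearest-neighbour resizing, and the timestep `t` are all hypothetical choices. In the paper, the amount of noise is selected through the PRF metric, whose formula is not given in this summary, so `t` is simply left as an input here.

```python
import numpy as np


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM-style noise schedule (an assumed, commonly used default)."""
    return np.linspace(beta_start, beta_end, num_steps)


def upsample_nearest(image, scale):
    """Resize an (H, W) or (H, W, C) array by an arbitrary positive scale
    using nearest-neighbour sampling (a stand-in for any simple upsampler)."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output pixel back to its source pixel, clamped to the image.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]


def inject_noise(x0, t, betas, rng):
    """Forward-diffuse x0 to timestep t:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = np.cumprod(1.0 - betas)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


# Hypothetical usage: a random 8x8 RGB "low-resolution" image in [-1, 1].
rng = np.random.default_rng(0)
low_res = rng.uniform(-1.0, 1.0, size=(8, 8, 3))
upsampled = upsample_nearest(low_res, scale=2.5)   # non-integer scale
betas = linear_beta_schedule()
t = 300  # assumption: in the paper this choice is guided by the PRF metric
noisy = inject_noise(upsampled, t, betas, rng)
# A pre-trained DGM's reverse diffusion process would then be run from
# timestep t down to 0, starting at `noisy`; that step is not shown here.
```

A small `t` keeps the result close to the blurry upsampled input, while a large `t` lets the reverse process overwrite the image content, which is the trade-off the summary attributes to the PRF.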

Self-Reference Image Super-Resolution via Pre-trained Diffusion Large Model and Window Adjustable Transformer

SRDiff: Single image super-resolution with diffusion probabilistic models

Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution

EDiffSR: An Efficient Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution

AdaDiffSR: Adaptive Region-aware Dynamic Acceleration Diffusion Model for Real-World Image Super-Resolution

ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution

CDPMSR: Conditional Diffusion Probabilistic Models for Single Image Super-Resolution

AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation

Single image super-resolution with denoising diffusion GANS

A Conditional Diffusion Model With Fast Sampling Strategy for Remote Sensing Image Super-Resolution

Denoising Diffusion Probabilistic Model with Adversarial Learning for Remote Sensing Super-Resolution

Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution

DSR-Diff: Depth Map Super-Resolution with Diffusion Model

Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images

Diffusion Models, Image Super-Resolution And Everything: A Survey

Effective Diffusion Transformer Architecture for Image Super-Resolution

Diffusion Model with Detail Complement for Super-Resolution of Remote Sensing

Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution