Abstract: Recently, Super-Resolution (SR) has achieved significant performance improvements by employing neural networks. Most SR methods conventionally train a single model for each targeted scale, which increases redundancy in training and deployment in proportion to the number of scales targeted. This paper challenges this conventional fixed-scale approach. Our preliminary analysis reveals that, surprisingly, encoders trained at different scales extract similar features from images. Furthermore, the commonly used scale-specific upsampler, Sub-Pixel Convolution (SPConv), exhibits significant inter-scale correlations. Based on these observations, we propose a framework for training multiple integer scales simultaneously with a single model. We use a single encoder to extract features and introduce a novel upsampler, Implicit Grid Convolution (IGConv), which integrates SPConv at all scales within a single module to predict multiple scales. Our extensive experiments demonstrate that training multiple scales with a single model reduces the training budget and stored parameters by one-third while achieving equivalent inference latency and comparable performance. Furthermore, we propose IGConv$^{+}$, which addresses spectral bias and input-independent upsampling and uses ensemble prediction to improve performance. As a result, SRFormer-IGConv$^{+}$ achieves a remarkable 0.25 dB improvement in PSNR on Urban100 $\times$4 while reducing the training budget, stored parameters, and inference cost compared to the existing SRFormer.
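To put the reported 0.25 dB PSNR gain in context, the following is a minimal sketch of the standard PSNR definition in plain Python. The function names `psnr` and `mse_ratio_for_gain` are hypothetical helpers introduced here, not taken from the paper. SR benchmarks often evaluate on the luminance channel with border cropping; this sketch does neither and assumes single-channel images given as nested lists.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists: a list of rows, each a list of pixel intensities.
    """
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    print(round(mse_ratio_for_gain(0.25), 3))
```

Under this definition, a 0.25 dB gain corresponds to an MSE of about 0.944 times the original, that is, roughly a 5.6% reduction in mean squared error.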

What problem does this paper attempt to address?

This paper addresses the redundancy in conventional super-resolution (SR) methods. Conventional SR methods usually train a separate model for each target scale, which wastes resources during both training and deployment. For example, to support multiple scales (such as ×2, ×3 and ×4), multiple models must be trained and stored separately, which significantly increases the training cost and storage requirements.

#### Main problems:

1. **Redundancy in fixed-scale training**: Most SR methods use fixed-scale training, that is, they train a specific model for each target scale. This increases the training budget and the number of stored parameters, and also makes it more difficult to find the optimal hyper-parameters.
2. **Cross-scale feature similarity not fully utilized**: Although the features extracted by models trained at different scales are very similar, conventional methods do not exploit this similarity to simplify model design and training.
3. **Limitations of existing upsamplers**: The commonly used Sub-Pixel Convolution (SPConv) upsampler shows significant correlation between different scales, but existing methods do not integrate these correlations to improve efficiency and performance.

#### New methods proposed in the paper:

To solve the above problems, the authors propose the following innovations:

1. **Multi-scale training framework**: By introducing Implicit Grid Convolution (IGConv), the authors propose a method that trains multiple integer scales simultaneously in a single model. This reduces the training budget and the number of stored parameters while maintaining or improving performance.
2. **New upsampler IGConv**: IGConv integrates the functions of SPConv at all scales into a unified module and uses cross-scale correlations to predict the outputs of multiple scales. In addition, the authors propose an improved version, IGConv$^{+}$, which further improves performance by introducing techniques such as frequency loss, implicit grid sampling (IGSample), and feature-level geometric re-parameterization (FGRep).
3. **Reduction of computational cost**: Compared with the traditional SPConv and SPConv$^{+}$, IGConv and IGConv$^{+}$ not only improve performance but also show higher efficiency in terms of inference time and memory usage.

Through these improvements, the authors demonstrate the effectiveness of the multi-scale training framework and show that it applies across a variety of datasets and models.
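The scale-specific nature of SPConv mentioned above can be illustrated with a minimal sketch of the standard sub-pixel (pixel-shuffle) rearrangement. This is not the paper's IGConv, only the conventional operation it generalizes. The helper names `spconv_out_channels` and `pixel_shuffle` are hypothetical, feature maps are assumed to be nested lists of shape (channels, height, width), and the channel ordering is assumed to follow the convention used by PyTorch's `nn.PixelShuffle`.

```python
def spconv_out_channels(image_channels, scale):
    """Channels the convolution before the shuffle must produce for one scale."""
    return image_channels * scale * scale


def pixel_shuffle(features, scale):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list."""
    r = scale
    channels_in = len(features)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    channels_out = channels_in // (r * r)
    height, width = len(features[0]), len(features[0][0])
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for y in range(height * r):
            for x in range(width * r):
                # Each output pixel picks one of the r*r sub-pixel channels.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = features[src][y // r][x // r]
    return out


if __name__ == "__main__":
    # For an RGB output, each scale needs a different output width,
    # so a plain SPConv layer is tied to one scale.
    for s in (2, 3, 4):
        print(s, spconv_out_channels(3, s))
    # Four 1x1 channels become one 2x2 channel at scale 2.
    print(pixel_shuffle([[[0]], [[1]], [[2]], [[3]]], 2))
```

Because the required channel count grows with the square of the scale (12, 27 and 48 channels for an RGB output at ×2, ×3 and ×4), a conventional SPConv layer cannot be reused across scales, which is the redundancy a shared multi-scale upsampler is meant to remove.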

Implicit Grid Convolution for Multi-Scale Image Super-Resolution

NoUCSR: Efficient Super-Resolution Network Without Upsampling Convolution

Epistemic-Uncertainty-Based Divide-and-Conquer Network for Single-Image Super-Resolution

Deep Arbitrary-Scale Image Super-Resolution Via Scale-Equivariance Pursuit

Activating More Information in Arbitrary-Scale Image Super-Resolution

Enhanced Implicit Function-Based Network for Arbitrary-Scale Image Super-Resolution

ASDN: A Deep Convolutional Network for Arbitrary Scale Image Super-Resolution

Image Superresolution using Scale-Recurrent Dense Network

UltraSR: Spatial Encoding is a Missing Key for Implicit Image Function-based Arbitrary-Scale Super-Resolution

Single Remote Sensing Image Super-Resolution Via a Generative Adversarial Network with Stratified Dense Sampling and Chain Training

Efficient Model Agnostic Approach for Implicit Neural Representation Based Arbitrary-Scale Image Super-Resolution

Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder

Sub-Pixel Convolutional Neural Network for Image Super-Resolution Reconstruction

Transforming Image Super-Resolution: A ConvFormer-Based Efficient Approach

Single image super-resolution via deep progressive multi-scale fusion networks

A New Convolutional Neural Network for Super-Resolution by Global and Local Residual

Iterative Network for Image Super-Resolution

Efficient Single-image Super-resolution Using Dual Path Connections with Multiple Scale Learning

ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

A2M: an Amplification-Arbitrary Module for Remote Sensing Image Super-Resolution