Abstract: Contrastive learning has achieved remarkable success on various high-level tasks, but fewer contrastive learning-based methods have been proposed for low-level tasks. It is challenging to directly adopt vanilla contrastive learning techniques proposed for high-level visual tasks to low-level image restoration problems, because the acquired high-level global visual representations are insufficient for low-level tasks that require rich texture and context information. In this paper, we investigate contrastive learning-based single image super-resolution (SISR) from two perspectives: positive and negative sample construction, and feature embedding. Existing methods take naive sample construction approaches (e.g., considering the low-quality input as a negative sample and the ground truth as a positive sample) and adopt a prior model (e.g., a pre-trained VGG model) to obtain the feature embedding. To address these limitations, we propose a practical contrastive learning framework for SISR, named PCL-SR. We generate many informative positive and hard negative samples in frequency space. Instead of utilizing an additional pre-trained network, we design a simple but effective embedding network inherited from the discriminator network, which is more task-friendly. We re-train existing benchmark methods with the proposed PCL-SR framework and achieve superior performance. Extensive experiments, including thorough ablation studies, demonstrate the effectiveness and technical contributions of PCL-SR. The code and pre-trained models can be found at https://github.com/Aitical/PCL-SISR.

What problem does this paper attempt to address?

The paper primarily focuses on addressing issues in the Single Image Super-Resolution (SISR) task, particularly on how to leverage contrastive learning to improve existing SISR methods. Specifically, the paper aims to solve the following key problems:

1. **How to effectively generate positive and negative samples**: Traditional contrastive learning methods face challenges when dealing with low-level vision tasks because directly applying contrastive learning techniques used in high-level vision tasks often fails to capture the rich textures and contextual information required. To address this, the authors propose a new sample generation strategy, including generating multiple information-rich positive samples (by sharpening high-resolution images) and hard-to-distinguish negative samples (by slightly blurring high-resolution images), to encourage the network to learn more details.
2. **How to design a task-appropriate feature embedding network**: Existing methods typically rely on pre-trained VGG networks as feature embedding networks, which may not be ideal because VGG networks tend to extract high-level semantic information rather than task-specific information. Therefore, the paper proposes using the discriminator network in the super-resolution network as the feature embedding network, which is more suitable for the SISR task and can better capture detail changes.
3. **How to apply contrastive learning to the SISR task**: Through the aforementioned sample generation strategy and feature embedding network design, the paper constructs a practical contrastive learning framework (PCL-SR) to improve the quality of SISR results. This framework not only generates multiple positive samples and hard-to-distinguish negative samples but also uses multi-layer intermediate features to calculate contrastive loss, enabling the model to learn useful information at different levels.

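The sample construction in point 1 can be illustrated with a small NumPy sketch. It is not the authors' implementation: the Gaussian low-pass filter, the unsharp-mask style high-frequency boost, the function names `gaussian_lowpass` and `make_samples`, and all parameter values are assumptions chosen only to show how blurred negatives and sharpened positives might be derived from one high-resolution image in frequency space.

```python
import numpy as np

def gaussian_lowpass(shape, sigma):
    """Transfer function of a spatial Gaussian blur (std `sigma` pixels).

    Hypothetical helper: evaluated on the FFT frequency grid so it can be
    multiplied directly with the output of np.fft.fft2.
    """
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))

def make_samples(hr, blur_sigmas=(0.5, 1.0), sharpen_amounts=(0.5, 1.0),
                 sharpen_sigma=1.0):
    """Build positives (sharpened) and hard negatives (slightly blurred).

    hr: 2-D float array (one image channel). Values are not clipped here,
    so the mean intensity of every sample equals that of `hr`.
    """
    spectrum = np.fft.fft2(hr)

    # Hard negatives: attenuate high frequencies a little (slight blur).
    negatives = []
    for sigma in blur_sigmas:
        lowpass = gaussian_lowpass(hr.shape, sigma)
        negatives.append(np.real(np.fft.ifft2(spectrum * lowpass)))

    # Positives: amplify the frequencies that the low-pass would remove
    # (unsharp masking expressed as a frequency-domain gain).
    lowpass = gaussian_lowpass(hr.shape, sharpen_sigma)
    positives = []
    for amount in sharpen_amounts:
        gain = 1.0 + amount * (1.0 - lowpass)
        positives.append(np.real(np.fft.ifft2(spectrum * gain)))

    return positives, negatives
```

Small blur strengths keep the negatives close to the ground truth, which is what makes them hard; larger values would move them toward the naive "low-quality input as negative" setting that the paper argues against.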
In summary, the core objective of the paper is to improve the quality of results in the Single Image Super-Resolution task by introducing a new contrastive learning framework, particularly in generating finer and more realistic images.
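A contrastive loss computed over multi-layer features, as the framework is described above, might look like the following sketch. The exact loss used by PCL-SR is not given in this summary, so the InfoNCE-style form, the cosine similarity, the temperature `tau`, and the function name `contrastive_loss` are all assumptions; the per-layer feature lists stand in for activations taken from the discriminator-based embedding network.

```python
import numpy as np

def _cosine(a, b):
    """Cosine similarity between two feature maps, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def contrastive_loss(sr_feats, pos_feats, neg_feats, tau=0.1):
    """InfoNCE-style loss averaged over embedding layers (illustrative only).

    sr_feats:  list of per-layer feature arrays for the super-resolved output
    pos_feats: list with one entry per positive sample, each a list of
               per-layer feature arrays
    neg_feats: same layout as pos_feats, for the negative samples
    """
    total = 0.0
    for layer, anchor in enumerate(sr_feats):
        pos = np.array([_cosine(anchor, p[layer]) for p in pos_feats]) / tau
        neg = np.array([_cosine(anchor, n[layer]) for n in neg_feats]) / tau

        # Numerically stable log-sum-exp over all candidates.
        logits = np.concatenate([pos, neg])
        m = logits.max()
        log_denominator = m + np.log(np.exp(logits - m).sum())

        # Mean over positives of -log(exp(pos_i) / sum over all candidates).
        total += float(np.mean(log_denominator - pos))
    return total / len(sr_feats)
```

The loss is small when the output's features align with the positives and not the negatives at every layer, and it grows when the output resembles a negative sample.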

A Practical Contrastive Learning Framework for Single-Image Super-Resolution
