Abstract: Recently, convolutional neural networks (CNNs) and Transformer-based networks have shown strong performance in remote sensing image super-resolution (RSISR). Nevertheless, the effective fusion of the inductive bias inherent in CNNs with the long-range modeling capabilities of the Transformer architecture remains relatively unexplored in RSISR. Accordingly, we propose an uncertainty-driven mixture convolution and transformer network (UMCTN) to improve performance. Specifically, to acquire multi-scale and hierarchical features, UMCTN adopts a U-shape architecture. It uses the dual-view aggregation block (DAB) based residual dual-view aggregation group (RDAG) in both the encoder and decoder, and introduces a dense-sparse transformer group (DSTG) only into the latent layer. This design avoids the considerable quadratic complexity inherent in vanilla Transformer structures. Moreover, we introduce a novel uncertainty-driven loss (UDL) to steer the network's attention towards pixels exhibiting significant variance. The primary objective is to improve reconstruction quality specifically in texture and edge regions. Experimental results on the UCMerced LandUse and AID datasets show that UMCTN achieves state-of-the-art performance compared with existing methods.
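
As a rough illustration of why confining self-attention to the latent layer of a U-shaped network reduces cost, the sketch below counts query-key pairs for global self-attention at full resolution and after downsampling. The 64 x 64 feature map, the two 2x downsampling stages, and the function names are illustrative assumptions, not values or code from the paper. It also does not model the dense-sparse mechanism of DSTG itself, only the effect of the smaller latent resolution.

```python
from typing import Tuple


def attention_pairs(height: int, width: int) -> int:
    """Number of query-key pairs in global self-attention over an H x W map."""
    tokens = height * width
    return tokens * tokens


def latent_size(height: int, width: int, num_downsamples: int) -> Tuple[int, int]:
    """Spatial size after stages that each halve the height and the width."""
    factor = 2 ** num_downsamples
    return height // factor, width // factor


# Assumed sizes, chosen only for illustration.
full_h, full_w = 64, 64
lat_h, lat_w = latent_size(full_h, full_w, num_downsamples=2)

full_cost = attention_pairs(full_h, full_w)    # pairs at full resolution
latent_cost = attention_pairs(lat_h, lat_w)    # pairs in the latent layer
reduction = full_cost // latent_cost

print(f"latent map: {lat_h} x {lat_w}, pair count reduced {reduction}x")
```

Under these assumptions, each 2x downsampling stage cuts the token count by 4 and the pair count by 16, which is why attention at the bottleneck is far cheaper than attention in the encoder or decoder.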

What problem does this paper attempt to address?

This paper addresses remote-sensing image super-resolution (RSISR): how to effectively fuse the local inductive bias of convolutional neural networks (CNNs) with the long-range modeling ability of the Transformer architecture to improve reconstruction quality, especially the restoration of detail in texture and edge regions. Existing methods handle these details poorly. CNN-based methods struggle to capture global structure information, while Transformer-based methods can model long-range dependencies but have high computational complexity and may ignore high-frequency details.

To solve these problems, the authors propose an uncertainty-driven mixture convolution and Transformer network (UMCTN). UMCTN extracts local detail information with the Residual Dual-view Aggregation Group (RDAG) and uses the Dense-Sparse Transformer Block (DSTB) in the latent layer to model global structure information and non-local dependencies. The authors also introduce a new uncertainty-driven loss (UDL), which makes the network focus on pixels with significant variance, especially in texture and edge regions, thereby improving reconstruction quality in those regions.

The main contributions of UMCTN are:

1. Proposing a new RSISR method, UMCTN, which combines the advantages of CNNs and Transformers and integrates an adaptive loss mechanism.
2. Designing a hybrid feature exploration network aimed at effectively capturing and faithfully restoring high-frequency details in remote-sensing images.
3. Introducing an uncertainty-driven loss that lets the network dynamically focus on complex high-frequency regions and gives it spatial adaptability.
4. Reporting experimental results on two public datasets (UCMerced LandUse and AID) showing that UMCTN performs well in both objective and subjective quality metrics, verifying its effectiveness.

Through these innovations, UMCTN aims to overcome the limitations of existing methods and provide a more efficient and higher-quality solution for remote-sensing image super-resolution.
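
The idea behind an uncertainty-driven loss can be sketched as a per-pixel weighted L1 loss in which pixels with higher estimated variance receive larger weights. This is a minimal illustration and not the paper's formulation: the shift-by-minimum weighting, the function name `uncertainty_weighted_l1`, and the toy values are all assumptions, and the uncertainty map is taken as given rather than predicted by a network.

```python
from typing import List, Sequence


def uncertainty_weighted_l1(
    sr: Sequence[float],
    hr: Sequence[float],
    log_var: Sequence[float],
) -> float:
    """Mean L1 error weighted by a non-negative per-pixel uncertainty score.

    Assumption: weights are log-variances shifted so that the minimum is zero,
    so the most certain pixels contribute nothing and uncertain ones dominate.
    """
    if not (len(sr) == len(hr) == len(log_var)) or len(sr) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    min_s = min(log_var)
    weights: List[float] = [s - min_s for s in log_var]
    weighted = [w * abs(p - t) for w, p, t in zip(weights, sr, hr)]
    return sum(weighted) / len(sr)


# Toy flattened 4-pixel image: the last two pixels stand in for an edge
# region with high estimated variance (hypothetical values).
sr = [0.5, 0.5, 0.5, 0.5]        # super-resolved prediction
hr = [0.5, 0.25, 1.0, 0.0]       # ground-truth high-resolution pixels
log_var = [0.0, 0.0, 2.0, 2.0]   # assumed per-pixel log-variance

loss = uncertainty_weighted_l1(sr, hr, log_var)
print(loss)
```

In this toy case the error on the second pixel is ignored because its uncertainty is minimal, while the two high-variance pixels determine the loss. A practical implementation would operate on image tensors and would need a separate stage or branch to estimate the uncertainty map.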

Uncertainty-driven mixture convolution and transformer network for remote sensing image super-resolution

Parallel-Connected Residual Channel Attention Network for Remote Sensing Image Super-Resolution

Epistemic-Uncertainty-Based Divide-and-Conquer Network for Single-Image Super-Resolution

Noucsr: Efficient Super-Resolution Network Without Upsampling Convolution

TransUNetCD: A Hybrid Transformer Network for Change Detection in Optical Remote-Sensing Images

Remote Sensing Image Super-Resolution Using Enriched Spatial-Channel Feature Aggregation Networks

A Unified Generative Adversarial Network With Convolution and Transformer for Remote Sensing Image Fusion

Efficient Adaptive Feature Fusion Network for Remote-Sensing Image Super-Resolution

An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution

RSCNN: A CNN-Based Method to Enhance Low-Light Remote-Sensing Images

Single Remote Sensing Image Super-Resolution Via a Generative Adversarial Network with Stratified Dense Sampling and Chain Training

Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution

Enhanced Window-Based Self-Attention with Global and Multi-Scale Representations for Remote Sensing Image Super-Resolution

Hybrid-Scale Hierarchical Transformer for Remote Sensing Image Super-Resolution

Lightweight Single Image Super-Resolution via Efficient Mixture of Transformers and Convolutional Networks

DTCNet: Transformer-CNN Distillation for Super-Resolution of Remote Sensing Image

Contextual Transformation Network for Lightweight Remote-Sensing Image Super-Resolution

ConvMambaSR: Leveraging State-Space Models and CNNs in a Dual-Branch Architecture for Remote Sensing Imagery Super-Resolution

Multi-granularity Backprojection Transformer for Remote Sensing Image Super-Resolution

TTST: A Top-k Token Selective Transformer for Remote Sensing Image Super-Resolution

An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution