Abstract: Balancing spectral resolution and spatial resolution in multi-spectral remote sensing images is challenging because of physical constraints, and pan-sharpening technology was developed to address this challenge. Although deep-learning-based pan-sharpening has recently made significant progress, most existing approaches face two primary limitations: (1) convolutional neural networks (CNNs) struggle to capture long-range dependencies, and (2) significant detail is lost during deep network training. Moreover, these methods generalize poorly to full-sized raw images because of scale disparities, which limits their practical use. To tackle these issues, we introduce TAMINet, a multi-spectral remote sensing image fusion network that leverages a two-stream coordinate attention mechanism and multi-detail injection. First, a two-stream feature extractor augmented with the coordinate attention (CA) block derives modality-specific features from low-resolution multi-spectral (LRMS) images and panchromatic (PAN) images. Feature-domain fusion and reconstruction of the pan-sharpened image follow. Crucially, multi-detail injection is applied during fusion and reconstruction to reintroduce details lost earlier in the process, which minimizes high-frequency detail loss. Finally, a novel hybrid loss function is proposed that combines spatial loss, spectral loss, and an additional loss component to enhance performance. The effectiveness of the proposed method was validated through experiments on WorldView-2, IKONOS, and QuickBird satellite images, benchmarked against current state-of-the-art techniques. Experimental findings show that TAMINet significantly improves pan-sharpening performance for large-scale images, underscoring its potential to enhance multi-spectral remote sensing image quality.
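The hybrid loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of any term, so everything below is an assumption made for illustration: the spatial term is taken to be a mean absolute error, the spectral term a mean spectral angle between per-pixel band vectors, the third "additional" component is omitted, and the function names and weights (`w_spatial`, `w_spectral`) are hypothetical. Images are assumed to be NumPy arrays of shape (height, width, bands).

```python
import numpy as np


def spatial_loss(pred, target):
    """Assumed spatial term: mean absolute error over all pixels and bands."""
    return float(np.mean(np.abs(pred - target)))


def spectral_loss(pred, target, eps=1e-8):
    """Assumed spectral term: mean spectral angle (radians) per pixel.

    Each pixel is treated as a vector across the band axis; the angle
    between the predicted and reference vectors ignores overall brightness.
    """
    dot = np.sum(pred * target, axis=-1)
    norms = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))


def hybrid_loss(pred, target, w_spatial=1.0, w_spectral=0.1):
    """Weighted sum of the two assumed terms (weights are hypothetical).

    The paper's additional loss component is not specified in the
    abstract and is therefore left out of this sketch.
    """
    return (w_spatial * spatial_loss(pred, target)
            + w_spectral * spectral_loss(pred, target))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((8, 8, 4)) + 0.1   # stand-in for a 4-band reference image
    brighter = 2.0 * reference                # same spectra, different intensity
    print("spatial :", spatial_loss(brighter, reference))
    print("spectral:", spectral_loss(brighter, reference))
    print("hybrid  :", hybrid_loss(brighter, reference))
```

In this sketch a uniformly brightened image is penalized by the spatial term but barely by the spectral term, which is one reason the two kinds of loss are often combined.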

TAENet: transencoder-based all-in-one image enhancement with depth awareness

RT-VENet: A Convolutional Network for Real-time Video Enhancement

CTFCD: Channel Transformer Based on Full Convolutional Decoder for Single Image Deraining

TIENet: task-oriented image enhancement network for degraded object detection

An Efficient Dehazing Algorithm Based on the Fusion of Transformer and Convolutional Neural Network

Dilated Residual Encode-Decode Networks for Image Denoising

A Dynamic Network with Transformer for Image Denoising

Low-Light Image Enhancement by Combining Transformer and Convolutional Neural Network

VisionTwinNet: Gated Clarity Enhancement Paired With Light-Robust CD Transformers

CTHD-Net: CNN-Transformer hybrid dehazing network via residual global attention and gated boosting strategy

MTIE-Net: Multi-technology fusion of low-light image enhancement network

TSN-CA: A Two-Stage Network with Channel Attention for Low-Light Image Enhancement

PerNet: Progressive and Efficient All-in-One Image-Restoration Lightweight Network

DANet: A Domain Alignment Network for Low-Light Image Enhancement

TransDehaze: transformer-enhanced texture attention for end-to-end single image dehaze

All-in-one aerial image enhancement network for forest scenes

Illumination-Aware Low-Light Image Enhancement with Transformer and Auto-Knee Curve

Low-Light Image Enhancement via Stage-Transformer-Guided Network

A cross Transformer for image denoising

Pan-Sharpening Network of Multi-Spectral Remote Sensing Images Using Two-Stream Attention Feature Extractor and Multi-Detail Injection (TAMINet)

Advanced RetinexNet: A fully convolutional network for low-light image enhancement