Abstract: Remote sensing images are highly vulnerable to cloud interference during imaging. Cloud occlusion, especially thick cloud occlusion, significantly reduces image quality, which in turn affects the many downstream tasks that rely on these images. Thick cloud occlusion causes ground information to be missing from the images. To address this problem, a thick cloud removal method based on a temporality global–local structure is proposed. The method consists of two stages: the global multi-temporal feature fusion (GMFF) stage and the local single-temporal information restoration (LSIR) stage. It uses the fused global multi-temporal features to restore the information occluded by thick cloud in each local single-temporal image. A global–local feature structure is then built into both stages, combining the global feature capture ability of the Transformer with the local feature extraction ability of the CNN, with the goal of effectively retaining the detailed information of the remote sensing images. Finally, the local feature extraction (LFE) module and the global–local feature extraction (GLFE) module are designed according to these global–local characteristics, with different module details in each of the two stages. Experimental results indicate that the proposed method performs significantly better than the compared methods on the established data set for the task of multi-temporal thick cloud removal. Across the four scenes, compared with the best competing method, CMSN, the peak signal-to-noise ratio (PSNR) improved by 2.675, 5.2255, and 4.9823 dB in the first, second, and third temporal images, respectively, an average improvement of 9.65% over the three temporal images. The correlation coefficient (CC) improved by 0.016, 0.0658, and 0.0145 in the first, second, and third temporal images, respectively, an average improvement of 3.35%. Structural similarity (SSIM) and root mean square error (RMSE) improved by 0.33% and 34.29%, respectively. Consequently, in the field of multi-temporal cloud removal, the proposed method makes better use of multi-temporal information and restores thick-cloud regions more effectively.
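For readers unfamiliar with the reported metrics, the following is a minimal sketch of how RMSE, PSNR, and CC are commonly computed between a restored image and a cloud-free reference. It is not the authors' evaluation code: the function names, the flattened-list image representation, and the `max_value=1.0` pixel range are all assumptions made for illustration. SSIM is omitted because it needs windowed statistics that do not fit in a short example.

```python
import math


def rmse(pred, ref):
    """Root mean square error between two equal-length pixel sequences."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    squared_error = sum((p - r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared_error / len(pred))


def psnr(pred, ref, max_value=1.0):
    """Peak signal-to-noise ratio in dB.

    max_value is the assumed maximum pixel value (1.0 for normalized
    images, 255 for 8-bit images); it is an illustrative default.
    """
    err = rmse(pred, ref)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_value / err)


def correlation_coefficient(pred, ref):
    """Pearson correlation coefficient (CC) between two pixel sequences."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_p * var_r)


# Toy example: a 2x2 reference patch and a slightly off restoration,
# both flattened to lists of normalized pixel values.
reference = [0.0, 0.5, 1.0, 0.5]
restored = [0.1, 0.5, 0.9, 0.5]

print(rmse(restored, reference))
print(psnr(restored, reference))
print(correlation_coefficient(restored, reference))
```

Lower RMSE and higher PSNR, CC, and SSIM indicate a restoration closer to the reference, which is the direction of the improvements reported above. In practice these metrics would be computed with an array library over full multi-band images rather than Python lists.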
