A Unified Generative Adversarial Network With Convolution and Transformer for Remote Sensing Image Fusion

Yuanyuan Wu,Mengxing Huang
DOI: https://doi.org/10.1109/tgrs.2024.3441719
IF: 8.2
2024-08-25
IEEE Transactions on Geoscience and Remote Sensing
Abstract:Images derived from an individual sensor fail to simultaneously satisfy the demands of high spatial, spectral, and temporal resolutions. Multisource remote sensing image (RSI) fusion provides efficient access to high-spatial-resolution multispectral (HRMS) images [spatial-spectral fusion (SSF)] and high temporal- and spatial-resolution images [spatiotemporal fusion (STF)]. While existing deep learning (DL)-based models can mainly implement either SSF or STF, there is an urgent need for models that can simultaneously implement both SSF and STF. A unified generative adversarial network with convolution and Transformer (CTUGAN) for SSF and STF is proposed. CTUGAN contains a adaptive convolutional Transformer generator (ACTG) and multiresolution convolutional Transformer discriminator (MCTD), both with the convolution and Transformer. First, a bidirectional local-global feature encoder is devised in the ACTG to extract local-global features via a high-to-low resolution and a low-to-high resolution. Then, a multihead cross-attention fusion decoder (MCAFD) is devised to aggregate and fuse complementary local-global features of various levels and resolutions hierarchically to restore valuable information. Moreover, MCTDs adversely learn multiresolution local-global features to identify the relative reality of products, and a generalized loss function is built to accomplish full supervision. Finally, numerous experiments on the SSF data (Gaofen-2 (GF-2) and QuikBird) and STF data [Coleambally Irrigation Area (CIA) and lower Gwydir catchment (LGC)] demonstrate that the proposed CTUGAN model outperforms both subjective and objective evaluations.
imaging science & photographic technology,remote sensing,engineering, electrical & electronic,geochemistry & geophysics
What problem does this paper attempt to address?