SWUNet: Swin Transformer Based UNet for Hyperspectral Reconstruction

B. Lall, Sadia Hussain
DOI: https://doi.org/10.1109/WHISPERS61460.2023.10431138
2023-10-31
Abstract: We introduce a hyperspectral vision transformer: an architecture that uses a Swin Transformer backbone for the hyperspectral image (HSI) super-resolution task. For RGB images, this low-level vision problem involves both estimating low-frequency structures and reconstructing high-frequency details. HSI is more challenging still, because of the large number of spectral bands, their high dimensionality, and the high correlation and redundancy between adjacent bands, all of which make HSI difficult to super-resolve. Many methods based on convolutional neural networks have been explored and have produced various state-of-the-art (SOTA) results in extracting low-level concepts from an image. Attempts made with transformers also show high-quality performance on such tasks. In this paper, we propose SWUNet, a UNet-inspired transformer for hyperspectral image super-resolution: an encoder-decoder structure built from Swin Transformer blocks. SWUNet consists of three parts: a convolutional extractor (for shallow features), a Swin Transformer based encoder, and a Swin Transformer based decoder with an enhancing module (pixel shuffle). This encoder-decoder applies shifted-window self-attention both within local windows and across windows. This allows two different window sets to capture attention along the spatial and spectral domains, efficiently expressing the correlation between adjacent spectral bands at every spatial location. Experiments on two public HSI datasets demonstrate the strength of leveraging shifted windows in HSI.
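As a rough illustration of two building blocks named in the abstract, the NumPy sketch below shows (a) how a feature map can be split into local windows, optionally after a cyclic shift so that the next attention layer mixes pixels across the previous window boundaries, and (b) the pixel-shuffle rearrangement used for upsampling. It is not the authors' implementation. The function names, the (H, W, C) and (C, H, W) array layouts, the window size, and the shift amount are all assumptions chosen for illustration; the abstract does not give these details.

```python
import numpy as np

def window_partition(x, ws, shift=0):
    """Split an (H, W, C) array into non-overlapping ws x ws windows.

    Returns an array of shape (num_windows, ws*ws, C). If shift > 0, the map
    is first rolled cyclically by `shift` pixels along both spatial axes, so
    the resulting windows straddle the unshifted window boundaries.
    (Layout and shift convention are assumptions, not taken from the paper.)
    """
    h, w, c = x.shape
    assert h % ws == 0 and w % ws == 0, "H and W must be multiples of ws"
    if shift:
        x = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (win_row, win_col, ws, ws, C)
    return x.reshape(-1, ws * ws, c)

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Follows the common sub-pixel convention: channel c*r*r + i*r + j at
    location (h, w) moves to output channel c at location (h*r + i, w*r + j).
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# Hypothetical toy input: a 4x4 spatial grid with 3 spectral bands.
hsi = np.arange(4 * 4 * 3).reshape(4, 4, 3)
regular = window_partition(hsi, ws=2)           # four 2x2 windows
shifted = window_partition(hsi, ws=2, shift=1)  # windows after a 1-pixel shift
upsampled = pixel_shuffle(np.arange(16).reshape(4, 2, 2), r=2)
```

In a Swin-style block, self-attention would be computed independently inside each row of `regular`, and the following block would use `shifted`, which is how information crosses window boundaries. Real implementations also mask attention between pixels that are adjacent only because of the cyclic wrap-around; that step is omitted here.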
Environmental Science, Engineering, Computer Science