SQformer: Spectral-Query Transformer for Hyperspectral Image Arbitrary-Scale Super-Resolution

Shuguo Jiang, Nanying Li, Meng Xu, Shuyu Zhang, Sen Jia
DOI: https://doi.org/10.1109/tgrs.2024.3463745
IF: 8.2
2024-10-04
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Super-resolution is vital for improving the quality of hyperspectral images (HSIs) under the trade-off between spatial and spectral resolution. However, deep learning HSI super-resolution approaches typically adopt a "one model, one scale" scheme that is inefficient to train and store. This makes it difficult to maximize the performance of on-orbit equipment and to align remote sensing data of multiple spatial resolutions. Therefore, this article addresses HSI arbitrary-scale super-resolution, which enables scaling HSIs to arbitrary sizes with a single model. To this end, we treat HSI arbitrary-scale super-resolution as a retrieval problem. This formulation conceptualizes the HSI as a dictionary of pixelwise tokens carrying spatial-spectral features, position information, and scale information. Its objective is to use a set of initialized tokens related to the high-resolution (HR) HSI as queries to retrieve matched spectral features from the low-resolution (LR) one, a process we call token-based query-to-spectrum. Since these query tokens can be constructed flexibly (e.g., through random initialization), we can generate as many of them as needed to reconstruct the HR HSI, thus achieving arbitrary-scale super-resolution. This process considers not only position information but also spectral features, which reduces spectral distortion. Building on this idea, we develop an HSI arbitrary-scale super-resolution method, dubbed spectral-query transformer (SQformer). Specifically, it begins by converting the LR HSI into a dictionary of LR tokens and then constructs the desired number of HR tokens. To enable flexible token construction, we design an implicit spectral token (specifically, a learnable vector) and replicate it as many times as needed to form the HR tokens. Next, the HR and LR tokens are passed into a transformer decoder, which finds the best-matching spectral response for each HR token by soft-weighting the LR tokens. Finally, the HR tokens are spatially rearranged in order, forming an HR HSI. Extensive experiments demonstrate its effectiveness on remote sensing data. The code will be released at: https://github.com/ShuGuoJ/SQformer.git.
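The query-to-spectrum idea in the abstract can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the function name `query_to_spectrum`, the `temperature` parameter, and the scoring rule (a spectral dot product plus a positional proximity term) are assumptions made for illustration, and a single softmax attention step with no learned projections stands in for the transformer decoder. It only shows how one replicated query token, placed at each HR pixel position, can retrieve a spectrum as a soft-weighted combination of LR tokens at a non-integer scale.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pixel_centers(n):
    # Normalized pixel-center coordinates in (0, 1) along one axis.
    return [(i + 0.5) / n for i in range(n)]

def query_to_spectrum(lr_hsi, scale, implicit_token, temperature=0.05):
    """Toy arbitrary-scale upsampling by soft-weighting LR tokens.

    lr_hsi: nested list of shape [H][W][bands].
    implicit_token: list of length `bands`; stands in for the learnable vector.
    temperature: hypothetical knob controlling how local the weighting is.
    """
    h, w = len(lr_hsi), len(lr_hsi[0])
    bands = len(implicit_token)
    out_h, out_w = max(1, round(h * scale)), max(1, round(w * scale))

    # LR "dictionary": one token per pixel, holding position and spectrum.
    lr_tokens = [(y, x, lr_hsi[i][j])
                 for i, y in enumerate(pixel_centers(h))
                 for j, x in enumerate(pixel_centers(w))]

    hr_tokens = []
    for qy in pixel_centers(out_h):
        for qx in pixel_centers(out_w):
            query = list(implicit_token)  # one replicated copy per HR pixel
            scores = []
            for ky, kx, spectrum in lr_tokens:
                # Assumed similarity: spectral match plus positional proximity.
                spectral = sum(q * s for q, s in zip(query, spectrum)) / math.sqrt(bands)
                positional = -((qy - ky) ** 2 + (qx - kx) ** 2) / temperature
                scores.append(spectral + positional)
            weights = softmax(scores)
            # Retrieved spectrum: convex combination of the LR spectra.
            hr_tokens.append([sum(wt * tok[2][b] for wt, tok in zip(weights, lr_tokens))
                              for b in range(bands)])

    # Rearrange the flat token list back into an [out_h][out_w][bands] grid.
    return [hr_tokens[r * out_w:(r + 1) * out_w] for r in range(out_h)]

# Usage: upsample a 2x2 image with 2 bands by a non-integer factor of 1.5.
lr = [[[1.0, 2.0], [3.0, 4.0]],
      [[5.0, 6.0], [7.0, 8.0]]]
hr = query_to_spectrum(lr, 1.5, implicit_token=[0.0, 0.0])
print(len(hr), len(hr[0]), len(hr[0][0]))
```

Because the number of query positions is chosen freely, the same function handles any scale. In the method described above, the query token and the matching function would be learned rather than fixed by hand as they are here.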
imaging science & photographic technology,remote sensing,engineering, electrical & electronic,geochemistry & geophysics