Linearly-evolved Transformer for Pan-sharpening

Junming Hou, Zihan Cao, Naishan Zheng, Xuan Li, Xiaoyu Chen, Xinyang Liu, Xiaofeng Cong, Man Zhou, Danfeng Hong
2024-04-19
Abstract: Vision transformer family has dominated the satellite pan-sharpening field driven by the global-wise spatial information modeling mechanism from the core self-attention ingredient. The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may be at the huge cost of model parameters and FLOPs, thus preventing its application over low-resource
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses pan-sharpening in satellite image fusion, with a focus on reducing computational cost while maintaining high performance. Specifically:

1. **Problem Background**: Traditional pan-sharpening methods, such as those based on convolutional neural networks (CNNs), are effective but hard to deploy when computational resources are constrained. In recent years, Vision Transformers have made significant progress in pan-sharpening, but their high computational cost makes them difficult to apply in low-resource satellite scenarios.
2. **Research Objective**: To reconcile the tension between high performance and high computational cost, the authors design a Linearly-evolved Transformer and use it to build a lightweight, efficient pan-sharpening framework that significantly reduces the demand for computational resources while maintaining performance.
3. **Main Contributions**:
   - Propose a novel, lightweight, and efficient pan-sharpening framework whose performance is comparable to existing advanced methods at a lower computational cost.
   - Reveal the first-order principles of the self-attention mechanism and propose a linearly-evolved transformer chain to replace the common practice of stacking multiple transformer layers.
   - Provide an effective, efficiently designed alternative for global modeling that is suitable for resource-constrained environments.

Extensive experiments on multiple satellite datasets validate that the method achieves performance comparable to, or even better than, existing advanced methods while using fewer computational resources. The method also performs consistently well on hyperspectral image fusion tasks.