Transformer-based image super-resolution and its lightweight

Dongxiao Zhang,Tangyao Qi,Juhao Gao
DOI: https://doi.org/10.1007/s11042-024-18140-z
IF: 2.577
2024-01-25
Multimedia Tools and Applications
Abstract:Transformer has shown remarkable performance improvements over convolutional neural network (CNN) in natural language processing and high-level vision tasks. However, its application in low-level vision tasks, such as single image super-resolution (SISR), is still under-explored. In this paper, we introduce an up-down iterative algorithm and design a residual down and up Transformer block (RDUTB) in the Transformer framework. Then we propose a network for SISR based on RDUTB, which can effectively reconstruct low resolution (LR) images. Furthermore, to address the increasing demand for SISR models that can run on low-end mobile devices, we simplify the proposed model structure and adopt a content-based early-stopping strategy in the proposed SISR model to reduce the parameters and accelerate the reconstruction process while maintaining high quality. Experimental results show that our proposed Transformer-based SISR network and its lightweight version achieve superior performance over both traditional CNN-based SISR methods and some of the latest Transformer-based SISR methods.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?