EdgeFormer: Edge-aware Efficient Transformer for Image Super-resolution

Xiaotong Luo, Zekun Ai, Qiuyuan Liang, Yuan Xie, Zhongchao Shi, Jianping Fan, Yanyun Qu
DOI: https://doi.org/10.1109/tim.2024.3436070
IF: 5.6
2024-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: The imaging system of visual measurement equipment is usually affected by environmental factors, such as distortion, blurring, and noise, which degrade the acquired image. This article studies image super-resolution (SR) with the vision transformer (ViT). However, ViT suffers from high computational cost and large GPU memory consumption, which hinders its application in image SR. Existing works mainly design lightweight network architectures for efficient inference while ignoring the intrinsic image content, and therefore waste computing resources on unnecessary regions. In this article, we present an edge-aware high-efficiency transformer (EdgeFormer) for accurate image SR, which performs self-attention (SA) only on the informative edge and texture regions so as to significantly reduce the computational complexity. It consists of a sparse edge-aware pixel selector (SEPS) and a multiscale efficient transformer module (METM). SEPS is designed as a tiny side subnetwork that generates a binary mask indicating the positions of edge or texture tokens, and a sparse error-driven loss is introduced to constrain the informative tokens in a more fine-grained way. METM then performs SA among the selected informative tokens. To parallelize execution effectively, a cross-sample sliding window (CSSW) strategy is designed to compensate for the uneven number of informative tokens in each sample. Our EdgeFormer can be combined with existing convolutional neural network (CNN)-based SR backbones to fully integrate global and local context information. Extensive experimental results demonstrate that our EdgeFormer achieves a clear performance gain with fewer floating-point operations (FLOPs) compared with other models. The code is available at: https://github.com/xiaotongtt/EdgeFormer.
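The idea of restricting SA to tokens flagged by a binary mask can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `edge_mask` is a hypothetical stand-in that thresholds gradient magnitude instead of using the learned SEPS subnetwork, `masked_self_attention` is a hypothetical single-head attention with identity query/key/value projections, and the multiscale design and the CSSW batching strategy are omitted.

```python
import numpy as np

def edge_mask(img, keep_ratio=0.25):
    """Hypothetical stand-in for a learned selector: flag roughly the top
    `keep_ratio` fraction of pixels by gradient magnitude (ties may add more)."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    k = max(1, int(round(keep_ratio * mag.size)))
    thresh = np.partition(mag.ravel(), -k)[-k]  # k-th largest magnitude
    return mag >= thresh

def masked_self_attention(tokens, mask):
    """Single-head self-attention among the tokens where `mask` is True.

    tokens: (N, C) float array, mask: (N,) bool array.
    Assumption: Q = K = V = the selected tokens (no learned projections).
    Unselected tokens are passed through unchanged.
    """
    out = tokens.copy()
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return out
    x = tokens[idx]                               # (M, C) selected tokens
    scores = x @ x.T / np.sqrt(x.shape[1])        # (M, M) instead of (N, N)
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    out[idx] = weights @ x                        # scatter results back
    return out

# Toy usage: an 8x8 image with one vertical step edge, one token per pixel.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
mask = edge_mask(img, keep_ratio=0.25).ravel()
tokens = np.random.default_rng(0).standard_normal((img.size, 4))
out = masked_self_attention(tokens, mask)
```

With M selected tokens out of N, the attention score matrix shrinks from N x N to M x M, which is where the saving in this sketch comes from.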