Exploiting Multi-scale Parallel Self-attention and Local Variation via Dual-branch Transformer-CNN Structure for Face Super-resolution
Jingang Shi,Yusi Wang,Zitong Yu,Guanxin Li,Xiaopeng Hong,Fei Wang,Yihong Gong
DOI: https://doi.org/10.1109/tmm.2023.3301225
IF: 7.3
2023-01-01
IEEE Transactions on Multimedia
Abstract:Recently, deep learning technique has been widely employed to deal with face super-resolution (FSR) problem. It aims to predict the nonlinear relationship between the low-resolution (LR) face images and corresponding high-resolution (HR) ones, which could recover the high-frequency details from the LR degraded textures. However, either CNN-based or Transformer-based approaches mostly enhance the details by exploiting the relationship of local pixels or patches on LR features, the nonlocal features are not fully taken into account for producing high-frequency textures. To improve the above problem, we design a novel dual-branch module which consists of Transformer and CNN respectively. The Transformer branch extracts multiple scale feature embeddings and explores local and nonlocal self-attention simultaneously. Thus, the parallel self-attention mechanism has superior capabilities to capture the local and nonlocal dependencies on face image in the face reconstruction. Furthermore, the traditional CNNs usually extract features by combining pixels in a local convolutional kernel, it may be not effective to recover lost high-frequency details since the variations of local pixels are not well measured, which is important in recovering vivid edges and contours. To this end, we propose the local variation based attention block on the CNN branch, which could enhance the capabilities by directly extracting features from the variation of neighboring pixels. Finally, the Transformer-branch and CNN-branch are combined together by the modulation block to fuse both nonlocal and local advantages from two branches. Experimental results demonstrate the effectiveness of the proposed method when compared with state-of-the-art approaches. The source code is available https://github.com/jingang-cv/DBTC.
computer science, information systems,telecommunications, software engineering