Abstract: Image deblurring continues to achieve impressive performance with the development of generative models. Nonetheless, it remains difficult to improve the perceptual quality and the quantitative scores of the recovered image at the same time. In this study, drawing inspiration from research on transformer properties, we introduce pretrained transformers to address this problem. In particular, we leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by quantitative metrics. The pretrained transformer can capture the global topological relations (i.e., self-similarity) of an image, and we observe that the topological relations captured for a sharp image change when blur occurs. By comparing the transformer features of the recovered image with those of the target image, the pretrained transformer provides high-resolution, blur-sensitive semantic information, which is critical in measuring the sharpness of the deblurred image. Building on these advantages, we present two types of novel perceptual losses to guide image deblurring. The first regards the features as vectors and computes the discrepancy between the representations extracted from the recovered image and the target image in Euclidean space. The second treats the features extracted from an image as a distribution and compares the distributions of the recovered image and the target image. We demonstrate the effectiveness of transformer properties in improving perceptual quality without sacrificing the quantitative score, peak signal-to-noise ratio (PSNR), over the most competitive models, such as Uformer, Restormer, and NAFNet, on defocus deblurring and motion deblurring tasks. The code is available at https://github.com/erfect2020/TransformerPerceptualLoss.
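
The two loss types described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the ViT features are already available as arrays of shape (tokens, channels); the function names are hypothetical; and because the abstract does not say which distribution distance is used, simple matching of mean and covariance stands in for it here.

```python
import numpy as np


def feature_vector_loss(feat_rec, feat_tgt):
    """Treat features as vectors: mean squared Euclidean distance
    between corresponding tokens of the recovered and target images."""
    diff = feat_rec - feat_tgt
    return float(np.mean(np.sum(diff ** 2, axis=-1)))


def feature_distribution_loss(feat_rec, feat_tgt):
    """Treat each image's tokens as samples from a distribution and
    compare first and second moments (an assumed stand-in distance)."""
    mu_rec, mu_tgt = feat_rec.mean(axis=0), feat_tgt.mean(axis=0)
    cov_rec = np.cov(feat_rec, rowvar=False)  # rows are token samples
    cov_tgt = np.cov(feat_tgt, rowvar=False)
    mean_term = np.sum((mu_rec - mu_tgt) ** 2)
    cov_term = np.sum((cov_rec - cov_tgt) ** 2)
    return float(mean_term + cov_term)


# Placeholder features: random arrays stand in for real ViT outputs.
rng = np.random.default_rng(0)
target_feat = rng.normal(size=(196, 8))
recovered_feat = target_feat + 0.1 * rng.normal(size=(196, 8))

print(feature_vector_loss(recovered_feat, target_feat))
print(feature_distribution_loss(recovered_feat, target_feat))
```

The vector loss depends on which token is paired with which, while the moment-based distribution loss ignores token order. This reflects the distinction the abstract draws between comparing representations directly and comparing them as distributions.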

DeblurDiNAT: A Generalizable Transformer for Perceptual Image Deblurring

Image Deblurring by Exploring In-Depth Properties of Transformer

DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer

Revisiting Image Deblurring with an Efficient ConvNet

Real‐world image deblurring using data synthesis and feature complementary network

Rethinking Image Deblurring Via CNN-Transformer Multiscale Hybrid Architecture

An Efficient Dehazing Algorithm Based on the Fusion of Transformer and Convolutional Neural Network

Bidirectional Transformer for Video Deblurring

SharpFormer: Learning Local Feature Preserving Global Representations for Image Deblurring

Broad Spectrum Image Deblurring via an Adaptive Super-Network

Image Deblurring With Image Blurring

MIMO-Uformer: A Transformer-Based Image Deblurring Network for Vehicle Surveillance Scenarios

HCTIRdeblur: A Hybrid Convolution-Transformer Network for Single Infrared Image Deblurring

VDTR: Video Deblurring with Transformer

Deep Robust Image Deblurring Via Blur Distilling and Information Comparison in Latent Space

Image Deblurring Using Multi-Stream Bottom-Top-Bottom Attention Network and Global Information-Based Fusion and Reconstruction Network

SFFTNet: Sparse Feature Fusion Transformer Network for Image Deblurring

Deep self-supervised spatial-variant image deblurring

Deep Idempotent Network for Efficient Single Image Blind Deblurring

Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network with Optical Flow Guided Training

CNB Net: A Two-Stage Approach for Effective Image Deblurring