Self-Reference Image Super-Resolution via Pre-trained Diffusion Large Model and Window Adjustable Transformer

Guangyuan Li, Wei Xing, Lei Zhao, Zehua Lan, Jiakai Sun, Zhanjie Zhang, Quanwei Zhang, Huaizhong Lin, Zhijie Lin
DOI: https://doi.org/10.1145/3581783.3611866
2023-01-01
Abstract: Reference-based super-resolution (RefSR) techniques currently leverage high-resolution (HR) reference images to provide useful content and texture information for low-resolution (LR) images during the super-resolution (SR) process. However, finding high-quality reference images is time-consuming, laborious, and in some cases impossible. To tackle this problem, we propose a new self-reference image super-resolution approach, termed DWTrans, that uses a pre-trained diffusion large model and a window adjustable transformer. Our method does not require manually acquired reference images as explicit input during either training or inference. Specifically, we feed the degraded LR images into a pre-trained stable diffusion large model to automatically generate corresponding high-quality self-reference (SRef) images, which provide valuable high-frequency details for the LR images during SR. To extract this high-frequency information from the SRef images, we design a window adjustable transformer with both a non-adjustable window layer (NWL) and an adjustable window layer (AWL). The NWL learns local features from LR images using a dense window, while the AWL acquires global features from the SRef images using a random sparse window. Furthermore, to fully utilize the high-frequency features in the SRef image, we introduce an adaptive deformable fusion module that adaptively fuses the features of the LR and SRef images. Experimental results show that DWTrans outperforms state-of-the-art methods on various benchmark datasets, both quantitatively and visually.
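The abstract does not say how the dense and random sparse windows are constructed, so the following is only an illustrative sketch of the general idea, not the DWTrans implementation. It assumes that a dense window is a non-overlapping spatial block of a feature map and that a random sparse window is a group of tokens drawn from random positions across the whole map. The function names (`dense_windows`, `random_sparse_windows`, `window_self_attention`) and the projection-free attention are hypothetical choices made for illustration.

```python
import numpy as np


def dense_windows(feat, win):
    """Split an (H, W, C) feature map into non-overlapping win x win blocks.

    Returns an array of shape (num_windows, win * win, C). Each window holds
    spatially adjacent tokens, so attention inside it is local.
    """
    h, w, c = feat.shape
    assert h % win == 0 and w % win == 0, "H and W must be multiples of win"
    x = feat.reshape(h // win, win, w // win, win, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, c)


def random_sparse_windows(feat, win, rng):
    """Group tokens into windows of win * win tokens taken from random positions.

    Assumption for illustration: a random permutation of all token positions is
    cut into equal groups, so each window spans the whole map (global context).
    Returns the windows and the permutation used to build them.
    """
    h, w, c = feat.shape
    assert h % win == 0 and w % win == 0, "H and W must be multiples of win"
    tokens = feat.reshape(h * w, c)
    perm = rng.permutation(h * w)
    return tokens[perm].reshape(-1, win * win, c), perm


def window_self_attention(windows):
    """Softmax self-attention inside each window, without learned projections."""
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ windows


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr_feat = rng.standard_normal((8, 8, 4))    # stand-in for LR features
    sref_feat = rng.standard_normal((8, 8, 4))  # stand-in for SRef features

    local = window_self_attention(dense_windows(lr_feat, 4))
    sparse, _ = random_sparse_windows(sref_feat, 4, rng)
    global_ = window_self_attention(sparse)
    print(local.shape, global_.shape)
```

Under these assumptions both paths produce windows of the same shape, so the same attention routine serves either one; only the rule that assigns tokens to windows differs.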