Raising The Limit Of Image Rescaling Using Auxiliary Encoding

Chenzhong Yin, Zhihong Pan, Xin Zhou, Le Kang, Paul Bogdan

2023-03-13

Abstract: Normalizing flow models using invertible neural networks (INN) have been widely investigated for successful generative image super-resolution (SR) by learning the transformation between the normal distribution of latent variable $z$ and the conditional distribution of high-resolution (HR) images given a low-resolution (LR) input. Recently, image rescaling models like IRN utilize the bidirectional nature of INN to push the performance limit of image upscaling by optimizing the downscaling and upscaling steps jointly. While the random sampling of latent variable $z$ is useful in generating diverse photo-realistic images, it is not desirable for image rescaling when accurate restoration of the HR image is more important. Hence, in place of random sampling of $z$, we propose auxiliary encoding modules to further push the limit of image rescaling performance. Two options to store the encoded latent variables in downscaled LR images, both readily supported in existing image file formats, are proposed. One is saved as the alpha-channel, the other is saved as meta-data in the image header, and the corresponding modules are denoted as suffixes -A and -M respectively. Optimal network architectural changes are investigated for both options to demonstrate their effectiveness in raising the rescaling performance limit on different baseline models including IRN and DLV-IRN.
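The contrast between sampling $z$ and storing it can be illustrated with a toy example. The sketch below is not the authors' network: it swaps in a fixed 1-D Haar-style split as a stand-in for a learned invertible transform, and all function names are hypothetical. It only shows that an invertible map restores the input exactly when the true high-frequency part is kept, and does not when that part is replaced by a random normal sample.

```python
import random


def haar_forward(signal):
    """Split an even-length signal into low-frequency averages and
    high-frequency differences (a fixed, exactly invertible transform)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def haar_inverse(low, high):
    """Rebuild the signal from its low- and high-frequency parts."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def max_abs_error(x, y):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(a - b) for a, b in zip(x, y))


# Toy "HR" signal; `low` plays the role of the LR image, `high` the role of z.
hr = [8.0, 6.0, 3.0, 7.0, 2.0, 2.0]
low, high = haar_forward(hr)

# Upscaling with the stored high-frequency part versus a random draw.
rng = random.Random(0)
sampled = [rng.gauss(0.0, 1.0) for _ in high]

err_stored = max_abs_error(hr, haar_inverse(low, high))
err_sampled = max_abs_error(hr, haar_inverse(low, sampled))
print("error with stored z: ", err_stored)
print("error with sampled z:", err_sampled)
```

In a learned model the stored part has to be compressed to fit in the LR file, so restoration is no longer exact; the toy only motivates why keeping some encoding of $z$ can help when fidelity matters more than diversity.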

Computer Vision and Pattern Recognition, Image and Video Processing

What problem does this paper attempt to address?

The paper addresses the loss of high-frequency information during image rescaling, especially in the upscaling step, and proposes two ways to improve rescaling models based on invertible neural networks (INN). Specifically:

1. **High-Frequency Information Storage**: Traditional image upscaling methods often lose high-frequency details, which degrades the restored high-resolution (HR) images. Models such as IRN use the bidirectional nature of INNs to optimize the downscaling and upscaling steps jointly, but they randomly sample the latent variable $z$. Random sampling suits the generation of diverse photo-realistic images but is not ideal when precise recovery of the HR image is needed.
2. **Auxiliary Encoding Modules**: To address this, the authors propose auxiliary encoding modules that compress and store the high-frequency information instead of sampling it. Two variants are described:
   - **IRN-A**: Adds an extra alpha channel to the output low-resolution (LR) image to store the compressed high-frequency information.
   - **IRN-M**: Uses an autoencoder to compress the latent variable $z$ into a compact latent variable and saves it as metadata of the image file.
3. **Experimental Validation**: A series of experiments validates both variants. The reported results show that storing the high-frequency information in either form improves the quality of the restored HR images.

In summary, the paper improves how image rescaling models handle high-frequency information in order to raise the quality of the HR image recovered during upscaling.
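The two storage options can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the latent values are assumed to lie in [-1, 1], the 8-bit quantization and base64 packing are choices made here for the example, and every function name is hypothetical. The learned encoder and decoder are omitted; only the storage step is shown.

```python
import base64
import struct


def quantize(value, lo=-1.0, hi=1.0):
    """Map a float in [lo, hi] to an 8-bit code, clipping out-of-range input."""
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * 255)


def dequantize(code, lo=-1.0, hi=1.0):
    """Approximate inverse of quantize (exact up to rounding error)."""
    return lo + code / 255 * (hi - lo)


def attach_alpha(rgb_pixels, latent):
    """Option -A (assumed layout): one quantized latent value per LR pixel,
    stored as a fourth channel next to R, G, B."""
    if len(rgb_pixels) != len(latent):
        raise ValueError("need exactly one latent value per LR pixel")
    return [(r, g, b, quantize(z)) for (r, g, b), z in zip(rgb_pixels, latent)]


def split_alpha(rgba_pixels):
    """Recover the RGB pixels and the dequantized latent values."""
    rgb = [(r, g, b) for r, g, b, _ in rgba_pixels]
    latent = [dequantize(a) for *_, a in rgba_pixels]
    return rgb, latent


def pack_metadata(compact_latent):
    """Option -M (assumed format): little-endian float32 values encoded as
    base64 text, suitable for a textual metadata field."""
    raw = struct.pack(f"<{len(compact_latent)}f", *compact_latent)
    return base64.b64encode(raw).decode("ascii")


def unpack_metadata(text):
    """Decode the base64 text back into a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Usage: two LR pixels with per-pixel latents, plus a 3-value compact latent.
rgba = attach_alpha([(10, 20, 30), (40, 50, 60)], [-1.0, 0.5])
rgb, latent = split_alpha(rgba)
meta = pack_metadata([0.25, -0.5, 1.0])
restored = unpack_metadata(meta)
```

The alpha-channel route ties the payload size to the LR resolution and limits precision to the channel bit depth, while the metadata route can carry a payload of any length but depends on the file format preserving that field.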
