DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Liangbin Xie, Xintao Wang, Xiangyu Chen, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong
2023-07-06
Abstract: Image super-resolution (SR) with generative adversarial networks (GAN) has achieved great success in restoring realistic details. However, it is notorious that GAN-based SR models will inevitably produce unpleasant and undesirable artifacts, especially in practical scenarios. Previous works typically suppress artifacts with an extra loss penalty in the training phase. They only work for in-distribution artifact types generated during training. When applied in real-world scenarios, we observe that those improved methods still generate obviously annoying artifacts during inference. In this paper, we analyze the cause and characteristics of the GAN artifacts produced in unseen test data without ground-truths. We then develop a novel method, namely, DeSRA, to Detect and then Delete those SR Artifacts in practice. Specifically, we propose to measure a relative local variance distance from MSE-SR results and GAN-SR results, and locate the problematic areas based on the above distance and semantic-aware thresholds. After detecting the artifact regions, we develop a finetune procedure to improve GAN-based SR models with a few samples, so that they can deal with similar types of artifacts in more unseen real data. Equipped with our DeSRA, we can successfully eliminate artifacts from inference and improve the ability of SR models to be applied in real-world scenarios. The code will be available at https://github.com/TencentARC/DeSRA.
Computer Vision and Pattern Recognition, Artificial Intelligence, Multimedia
What problem does this paper attempt to address?
This paper addresses the unpleasant artifacts that image super-resolution (SR) models based on generative adversarial networks (GANs) produce in practical use, especially on unseen real-world test data. These artifacts seriously degrade the visual quality of the restored image and the user experience. Existing methods usually suppress artifacts by adding loss penalties during training, but they are effective only for the in-distribution artifact types that appear during training. On real-world inputs, the improved models still generate obvious artifacts at inference time.

The paper analyzes the causes and characteristics of the artifacts that GAN-SR models generate on unseen test data without ground truth, and develops a method named DeSRA to detect and remove them. DeSRA has two main steps:

1. **Detecting artifact regions**: Compute the relative local variance distance between the MSE-SR and GAN-SR results to locate problem areas, then filter out noise with semantic-aware thresholds to produce the final artifact mask.
2. **Improving the GAN-SR model**: Collect a small number of GAN-SR results that contain detected artifacts and replace the artifact regions with the corresponding MSE-SR results to generate pseudo ground truth (pseudo GT). Then fine-tune the GAN-SR model on the resulting training pairs so that it no longer produces similar types of artifacts.

With DeSRA, the authors eliminate artifacts from the inference results and make the SR model more usable in real-world scenarios.
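The second step above can be sketched as a small data-collection loop. This is a minimal illustration, not the authors' implementation: `collect_finetune_pairs`, `mse_sr`, `gan_sr`, `detect_mask`, and `max_pairs` are hypothetical names, the two SR models and the detector are passed in as plain callables, and the mask is assumed to be 1 (or `True`) where an artifact was detected.

```python
from typing import Callable, Iterable, List, Tuple

import numpy as np

Image = np.ndarray


def collect_finetune_pairs(
    lr_images: Iterable[Image],
    mse_sr: Callable[[Image], Image],
    gan_sr: Callable[[Image], Image],
    detect_mask: Callable[[Image, Image], Image],
    max_pairs: int = 100,  # hypothetical cap on the "small number" of samples
) -> List[Tuple[Image, Image]]:
    """Build (low-resolution input, pseudo GT) pairs for fine-tuning."""
    pairs: List[Tuple[Image, Image]] = []
    for lr in lr_images:
        y_mse = mse_sr(lr)  # output of the MSE-trained SR model
        y_gan = gan_sr(lr)  # output of the GAN-trained SR model
        mask = detect_mask(y_mse, y_gan).astype(bool)
        if not mask.any():
            continue  # no artifact detected, so this sample is skipped
        # Replace artifact pixels with the MSE-SR result, keep GAN-SR elsewhere.
        pseudo_gt = np.where(mask, y_mse, y_gan)
        pairs.append((lr, pseudo_gt))
        if len(pairs) >= max_pairs:
            break
    return pairs
```

The returned pairs would then feed whatever training loop the GAN-SR model already uses. The loss terms and schedule for that fine-tuning are not specified in this summary, so they are left out of the sketch.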
### Formula Summary

- Definition of the relative local variance distance:

  \[ D=\frac{2\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2 + C} \]

  where $\sigma_x$ and $\sigma_y$ are the standard deviations of the MSE-SR and GAN-SR results in the local area, respectively, and $C$ is a constant used to stabilize the denominator.

- Detection map after semantic-aware adjustment:

  \[ M(i, j)= \begin{cases} 0 & \text{if }\frac{D(i, j)}{A_k}\geq\text{threshold}\\ 1 & \text{if }\frac{D(i, j)}{A_k}<\text{threshold} \end{cases} \]

  where $D(i, j)$ is the texture difference value at pixel $(i, j)$, $A_k$ is the adjustment weight of the $k$-th semantic category, and $\text{threshold}$ is a hyperparameter.

- Pseudo ground truth generation:

  \[ e_y = M\cdot y_{\text{MSE}}+(1 - M)\cdot y_{\text{GAN}} \]

  where $e_y$ is the generated pseudo ground truth, $y_{\text{MSE}}$ and $y_{\text{GAN}}$ are the MSE-SR and GAN-SR results, respectively, and $M$ is the detected artifact mask.

Together, these steps let DeSRA address the artifacts that GAN-SR models generate on real-world data and make the models more useful in practice.
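The three formulas above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the released code: the function names, the window size `k`, the constant `c`, the default `threshold`, and the per-class `weights` array are all placeholder choices, images are assumed to be single-channel float arrays, and `seg` is assumed to be an integer map giving the semantic class index $k$ of each pixel.

```python
import numpy as np


def local_std(img: np.ndarray, k: int = 11) -> np.ndarray:
    """Standard deviation over each k x k neighbourhood (k odd, edge-padded)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.std(axis=(-2, -1))  # same height and width as img


def relative_variance_distance(y_mse, y_gan, k=11, c=1e-4):
    """D = 2*sx*sy / (sx^2 + sy^2 + C), computed per pixel."""
    sx = local_std(y_mse, k)  # sigma_x: local std of the MSE-SR result
    sy = local_std(y_gan, k)  # sigma_y: local std of the GAN-SR result
    return 2.0 * sx * sy / (sx**2 + sy**2 + c)


def artifact_mask(d, seg, weights, threshold=0.5):
    """M = 1 where D / A_k < threshold, else 0."""
    a = weights[seg]  # look up A_k for every pixel from its class index
    return (d / a < threshold).astype(np.float64)


def pseudo_gt(mask, y_mse, y_gan):
    """Pseudo GT = M * y_MSE + (1 - M) * y_GAN."""
    return mask * y_mse + (1.0 - mask) * y_gan
```

Since $D$ is close to 1 when the two local standard deviations agree and falls toward 0 as they diverge, low values mark regions where the GAN-SR texture departs from the MSE-SR result. One property worth checking before use: with $C$ only in the denominator, as written above, a window in which both outputs are perfectly flat gives $D = 0$ and is therefore flagged, so the handling of flat regions should be verified against the released code.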