CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using Score-Based Diffusion Models

Maitreya Suin, Rama Chellappa

2024-02-09

Abstract: Recent generative-prior-based methods have shown promising blind face restoration performance. They usually project the degraded images to the latent space and then decode high-quality faces either by single-stage latent optimization or directly from the encoding. Generating fine-grained facial details faithful to inputs remains a challenging problem. Most existing methods produce either overly smooth outputs or alter the identity as they attempt to balance between generation and reconstruction. This may be attributed to the typical trade-off between quality and resolution in the latent space. If the latent space is highly compressed, the decoded output is more robust to degradations but shows worse fidelity. On the other hand, a more flexible latent space can capture intricate facial details better, but is extremely difficult to optimize for highly degraded faces using existing techniques. To address these issues, we introduce a diffusion-based prior inside a VQGAN architecture that focuses on learning the distribution over uncorrupted latent embeddings. With such knowledge, we iteratively recover the clean embedding conditioning on the degraded counterpart. Furthermore, to ensure the reverse diffusion trajectory does not deviate from the underlying identity, we train a separate Identity Recovery Network and use its output to constrain the reverse diffusion process. Specifically, using a learnable latent mask, we add gradients from a face-recognition network to a subset of latent features that correlates with the finer identity-related details in the pixel space, leaving the other features untouched. Disentanglement between perception and fidelity in the latent space allows us to achieve the best of both worlds. We perform extensive evaluations on multiple real and synthetic datasets to validate the superiority of our approach.

Computer Science

What problem does this paper attempt to address?

This paper addresses how to preserve a person's identity while restoring a high-quality face image in Blind Face Restoration (BFR). Existing generative-prior methods can recover high-quality faces from low-quality inputs, but they struggle to generate fine-grained facial details and often produce overly smooth outputs or alter the identity. The authors attribute this to the typical trade-off between quality and resolution in the latent space: a highly compressed latent space makes the decoded output more robust to degradation but lowers fidelity, while a more flexible latent space captures intricate facial details better but is extremely difficult to optimize for severely degraded images. To address this, the authors propose CLR-Face, a conditional latent refinement strategy that places a diffusion-based prior inside a VQGAN architecture to learn the distribution over uncorrupted latent embeddings. The clean embedding is then recovered iteratively, conditioned on its degraded counterpart. To keep the reverse diffusion trajectory from deviating from the underlying identity, a separate Identity Recovery Network (IRN) is trained and its output is used to constrain the reverse diffusion process. In addition, a learnable latent mask restricts gradients from a face-recognition network to the subset of latent features that correlates with finer identity-related details, leaving the other features untouched. In short, the paper aims to improve the balance between perceptual quality and identity preservation in blind face restoration, especially for severely degraded images, by combining a latent diffusion prior with identity-guided, masked latent updates.
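The masked identity update can be illustrated with a toy sketch. This is not the authors' implementation: the function names, latent shape, step size, fixed binary mask, and quadratic stand-in for the face-recognition loss are all assumptions made for illustration. In the paper the mask is learned and the gradient comes from a face-recognition network. The sketch only shows the idea of applying an identity gradient to a subset of latent features while leaving the rest untouched.

```python
import numpy as np

def toy_identity_grad(z, z_ref):
    # Hypothetical stand-in for a face-recognition gradient:
    # the gradient of 0.5 * ||z - z_ref||^2 with respect to z.
    return z - z_ref

def masked_identity_guidance(z, grad_identity, mask, step_size=0.1):
    # Move only the latent entries selected by the mask along the
    # negative identity gradient; entries with mask == 0 are unchanged.
    return z - step_size * mask * grad_identity

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8, 8))      # assumed latent: channels x height x width
z_ref = rng.normal(size=(4, 8, 8))  # assumed identity-consistent reference latent

# Assumption: the first two channels carry identity-related detail.
mask = np.zeros_like(z)
mask[:2] = 1.0

z_new = masked_identity_guidance(z, toy_identity_grad(z, z_ref), mask, step_size=0.5)
```

In this toy setting the masked channels move toward the reference while the unmasked channels keep their values, which mirrors the separation between fidelity-related and perception-related features described above.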

3D Priors-Guided Diffusion for Blind Face Restoration

DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration

DifFace: Blind Face Restoration with Diffused Error Contraction

Towards Unsupervised Blind Face Restoration using Diffusion Prior

Towards Real-World Blind Face Restoration with Generative Diffusion Prior

Blind Face Restoration Via Integrating Face Shape and Generative Priors

DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration

AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior

Enhancing Quality of Pose-varied Face Restoration with Local Weak Feature Sensing and GAN Prior

Toward Real-World Blind Face Restoration With Generative Diffusion Prior

BFRFormer: Transformer-based generator for Real-World Blind Face Restoration

Blind Face Restoration under Extreme Conditions: Leveraging 3D-2D Prior Fusion for Superior Structural and Texture Recovery

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Real-World Blind Face Restoration with Generative Facial Prior and Degradation Simulation

Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

PFStorer: Personalized Face Restoration and Super-Resolution

OSDFace: One-Step Diffusion Model for Face Restoration

Joint Generative Image Deblurring Aided by Edge Attention Prior and Dynamic Kernel Selection

Degradation learning and Skip-Transformer for blind face restoration

Towards Real-World Blind Face Restoration with Generative Facial Prior