Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration

Wanglong Lu, Jikai Wang, Tao Wang, Kaihao Zhang, Xianta Jiang, Hanli Zhao
DOI: https://doi.org/10.1016/j.patcog.2024.111312
2024-12-31
Abstract: Blind face restoration aims to recover high-quality facial images from various unidentified sources of degradation, posing significant challenges due to the minimal information retrievable from the degraded images. Prior knowledge-based methods, leveraging geometric priors and facial features, have led to advancements in face restoration but often fall short of capturing fine details. To address this, we introduce a visual style prompt learning framework that utilizes diffusion probabilistic models to explicitly generate visual prompts within the latent space of pre-trained generative models. These prompts are designed to guide the restoration process. To fully utilize the visual prompts and enhance the extraction of informative and rich patterns, we introduce a style-modulated aggregation transformation layer. Extensive experiments and applications demonstrate the superiority of our method in achieving high-quality blind face restoration. The source code is available at https://github.com/LonglongaaaGo/VSPBFR.
Computer Vision and Pattern Recognition, Multimedia