RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution

Jiangang Wang, Qingnan Fan, Jinwei Chen, Hong Gu, Feng Huang, Wenqi Ren
2024-12-10
Abstract: Benefiting from their powerful generative capabilities, pretrained diffusion models have garnered significant attention for real-world image super-resolution (Real-SR). Existing diffusion-based SR approaches typically utilize semantic information from degraded images and restoration prompts to activate the prior for producing realistic high-resolution images. However, general-purpose pretrained diffusion models, not designed for restoration tasks, often have a suboptimal prior, and manually defined prompts may fail to fully exploit the generative potential. To address these limitations, we introduce RAP-SR, a novel restoration prior enhancement approach in pretrained diffusion models for Real-SR. First, we develop the High-Fidelity Aesthetic Image Dataset (HFAID), curated through a Quality-Driven Aesthetic Image Selection Pipeline (QDAISP). Our dataset not only surpasses existing ones in fidelity but also excels in aesthetic quality. Second, we propose the Restoration Priors Enhancement Framework, which includes Restoration Priors Refinement (RPR) and Restoration-Oriented Prompt Optimization (ROPO) modules. RPR refines the restoration prior using the HFAID, while ROPO optimizes the unique restoration identifier, improving the quality of the resulting images. RAP-SR effectively bridges the gap between general-purpose models and the demands of Real-SR by enhancing the restoration prior. Leveraging the plug-and-play nature of RAP-SR, our approach can be seamlessly integrated into existing diffusion-based SR methods, boosting their performance. Extensive experiments demonstrate its broad applicability and state-of-the-art results. Code and datasets will be available upon acceptance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two weaknesses of pretrained diffusion models in the real-world image super-resolution (Real-SR) task: insufficient restoration priors and imprecise prompt-based activation of those priors. Specifically:

1. **Insufficient restoration priors**: Although existing pretrained diffusion models have strong generative capabilities, they are not designed for restoration tasks, and thus perform poorly at generating high-quality, high-resolution images with rich details.
2. **Imprecise prompt activation**: Existing diffusion-based super-resolution methods usually rely on manually defined restoration prompts to activate restoration priors. However, natural language often cannot accurately describe image quality under varied degradation conditions, so the restoration priors are not activated precisely enough.

To overcome these challenges, the paper proposes **RAP-SR** (RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution), a restoration prior enhancement method with two key parts:

1. **High-Fidelity Aesthetic Image Dataset (HFAID)**: 5,000 high-quality, aesthetically strong images are selected through the Quality-Driven Aesthetic Image Selection Pipeline (QDAISP) to enhance the restoration priors of pretrained models.
2. **Restoration Priors Enhancement Framework**: It includes the Restoration Priors Refinement (RPR) and Restoration-Oriented Prompt Optimization (ROPO) modules. RPR refines the restoration priors using HFAID, while ROPO strengthens the association between prompts and image quality by optimizing a dedicated restoration identifier, thereby activating the restoration priors more accurately.

With these components, RAP-SR bridges the gap between general-purpose models and the requirements of the Real-SR task, and it can be integrated into existing diffusion-based super-resolution methods to improve their performance. Experimental results show that RAP-SR performs well on multiple datasets, with significant improvements in no-reference metrics such as MANIQA, MUSIQ, CLIPIQA, and BRISQUE.
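
The text above does not spell out how QDAISP scores or filters images, so the following is only a minimal sketch of what a quality-driven selection step could look like, not the authors' pipeline. It assumes each candidate image already carries two precomputed scores on comparable scales (a no-reference quality score and an aesthetic score, both higher-is-better). The names `ScoredImage` and `select_images`, the thresholds, and the sum-based ranking are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredImage:
    path: str
    quality: float    # hypothetical no-reference quality score, higher is better
    aesthetic: float  # hypothetical aesthetic score, higher is better


def select_images(candidates, min_quality, min_aesthetic, k):
    """Keep images that pass both thresholds, then return the top k.

    Ranking by the plain sum of the two scores is an illustrative
    assumption; the source does not describe the actual criterion.
    """
    kept = [
        c for c in candidates
        if c.quality >= min_quality and c.aesthetic >= min_aesthetic
    ]
    # Sort best-first; the path is only a tie-breaker for determinism.
    kept.sort(key=lambda c: (c.quality + c.aesthetic, c.path), reverse=True)
    return kept[:k]


# Toy candidate pool with made-up scores.
pool = [
    ScoredImage("a.png", 0.9, 0.8),
    ScoredImage("b.png", 0.4, 0.9),   # fails the quality threshold
    ScoredImage("c.png", 0.7, 0.7),
    ScoredImage("d.png", 0.8, 0.3),   # fails the aesthetic threshold
    ScoredImage("e.png", 0.6, 0.95),
]

selected = select_images(pool, min_quality=0.5, min_aesthetic=0.5, k=2)
print([s.path for s in selected])
```

In a setting like the one described, `k` would be 5,000 and the scores would come from real image-quality and aesthetic predictors; which predictors and thresholds the authors use is not stated here.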