Abstract: Natural images captured by mobile devices often suffer from multiple types of degradation, such as noise, blur, and low light. Traditional image restoration methods require manual selection of specific tasks, algorithms, and execution sequences, which is time-consuming and may yield suboptimal results. All-in-one models, though capable of handling multiple tasks, typically support only a limited range and often produce overly smooth, low-fidelity outcomes due to their broad data distribution fitting. To address these challenges, we first define a new pipeline for restoring images with multiple degradations, and then introduce RestoreAgent, an intelligent image restoration system leveraging multimodal large language models. RestoreAgent autonomously assesses the type and extent of degradation in input images and performs restoration through (1) determining the appropriate restoration tasks, (2) optimizing the task sequence, (3) selecting the most suitable models, and (4) executing the restoration. Experimental results demonstrate the superior performance of RestoreAgent in handling complex degradation, surpassing human experts. Furthermore, the system's modular design facilitates the fast integration of new tasks and models, enhancing its flexibility and scalability for various applications.


### What problems does this paper attempt to solve?

This paper addresses high-quality restoration of natural images that suffer from several degradations at once, such as noise, blur, and low light. It identifies two limitations of existing approaches:

1. **Manual selection of tasks and models**: Traditional methods require a person to choose the specific tasks, algorithms, and execution sequence, which is time-consuming and may lead to suboptimal results.
2. **Limitations of all-in-one models**: Although all-in-one models can handle multiple tasks, they usually support only a limited range of tasks, and because they fit a broad data distribution, they often produce overly smooth, low-fidelity results.

To solve these problems, the paper proposes **RestoreAgent**, an intelligent image restoration system based on multimodal large language models (MLLMs). The main goals of RestoreAgent are:

- **Automatically evaluate the type and degree of degradation**: Identify which degradations are present in the input image and how severe they are.
- **Optimize the task sequence**: Determine the order in which the restoration tasks should run to get the best result.
- **Select the optimal model**: Choose the most appropriate model from the available model library for the specific degradation pattern.
- **Automatically execute the restoration process**: Once the sequence and models are chosen, run the entire restoration without human intervention.

With these capabilities, RestoreAgent handles complex images with multiple degradations more effectively, is reported to outperform human experts, and can quickly integrate new tasks and models, which improves the flexibility and scalability of the system.
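The four goals above can be read as a plan-then-execute loop. The following is a minimal sketch under stated assumptions, not the paper's implementation: `RestorationAgent`, `toy_planner`, `median3`, and `gain2` are hypothetical names, images are toy lists of floats, and a simple rule-based planner stands in for the multimodal LLM that RestoreAgent uses to produce the plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Image = List[float]                     # toy stand-in for pixel data in [0, 1]
Model = Callable[[Image], Image]
Library = Dict[str, Dict[str, Model]]   # task name -> {model name -> model}
Plan = List[Tuple[str, str]]            # ordered (task, model name) steps


def median3(img: Image) -> Image:
    """Hypothetical denoiser: 3-tap median filter (shorter window at the edges)."""
    out = []
    for i in range(len(img)):
        window = sorted(img[max(0, i - 1): i + 2])
        out.append(window[len(window) // 2])
    return out


def gain2(img: Image) -> Image:
    """Hypothetical low-light enhancer: double the brightness, clip at 1.0."""
    return [min(1.0, 2 * p) for p in img]


def toy_planner(image: Image, library: Library) -> Plan:
    """Rule-based stand-in for the MLLM planner (an assumption of this sketch).

    It covers steps 1-3: decide which tasks are needed, order them,
    and pick one model per task.
    """
    plan: Plan = []
    if max(abs(a - b) for a, b in zip(image, image[1:])) > 0.2:
        plan.append(("denoise", "median3"))    # large local jumps: treat as noise
    if sum(image) / len(image) < 0.3:
        plan.append(("low_light", "gain2"))    # dark on average: brighten
    return plan


@dataclass
class RestorationAgent:
    planner: Callable[[Image, Library], Plan]
    library: Library

    def restore(self, image: Image) -> Tuple[Image, Plan]:
        plan = self.planner(image, self.library)
        for task, model_name in plan:          # step 4: execute the plan in order
            image = self.library[task][model_name](image)
        return image, plan


library: Library = {"denoise": {"median3": median3}, "low_light": {"gain2": gain2}}
agent = RestorationAgent(planner=toy_planner, library=library)
restored, plan = agent.restore([0.1, 0.1, 0.4, 0.1, 0.1])
print(plan, restored)
```

Keeping the planner separate from the model library mirrors the modularity claim in the abstract: adding a task or model only means adding an entry to `library` and teaching the planner about it.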
### Formula summary

To describe the problem, the paper defines a set \( D = \{d_1, d_2, \ldots, d_n\} \) of degradation types, where each \( d_i \) represents a specific type of image degradation (such as noise, JPEG artifacts, blur, raindrops, fog, or low light). For each degradation type \( d_i \) there is a dedicated model library \( M_{d_i} = \{M_{d_i}^1, M_{d_i}^2, \ldots\} \), and each model \( M_{d_i}^j \) is trained specifically to mitigate degradation of type \( d_i \).

The problem is formally defined as follows:

- **Input**: A degraded image \( I \) affected by multiple degradation types from \( D \), a model library \( \{M_{d_1}, M_{d_2}, \ldots, M_{d_n}\} \) for handling \( D \), and a user-provided scoring function \( S \) that evaluates the restoration result.
- **Objective**: Find the model execution sequence \( \sigma = (M_{a_1}^{b_1}, M_{a_2}^{b_2}, \ldots, M_{a_m}^{b_m}) \), where \( M_{a_k}^{b_k} \) denotes model \( b_k \) from the library for degradation type \( a_k \), that maximizes the restoration quality of \( I \) as measured by \( S \):

\[ \sigma^* = \arg\max_{\sigma \in \mathcal{S}(D, M)} S(I, \sigma) \]

Here \( \mathcal{S}(D, M) \) is the set of all possible sequences of (degradation type, model) pairs; the calligraphic letter distinguishes it from the scoring function \( S \).

By solving this problem, the researchers aim to find the best combination of restoration order and model selection, thereby improving the quality of images affected by multiple degradations.
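To make the objective concrete, here is a brute-force sketch of the arg max on toy data. It is an illustration of the formula only and not how RestoreAgent finds the sequence (the paper uses a multimodal LLM to predict it). Several things are assumptions of this sketch: each detected degradation is handled exactly once, the score \( S(I, \sigma) \) is computed by applying \( \sigma \) to \( I \) and comparing with a known clean reference, and all function and model names (`despike`, `mean3`, `gain2`, `sqrt_curve`) are hypothetical.

```python
from itertools import permutations, product
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Image = List[float]
Model = Callable[[Image], Image]
Library = Dict[str, Dict[str, Model]]
Sequence_ = Tuple[Tuple[str, str], ...]   # ordered (task, model name) pairs


def despike(img: Image, threshold: float = 0.5) -> Image:
    """Replace interior pixels that deviate strongly from their neighbours' mean."""
    out = list(img)
    for i in range(1, len(img) - 1):
        neighbour_mean = (img[i - 1] + img[i + 1]) / 2
        if abs(img[i] - neighbour_mean) > threshold:
            out[i] = neighbour_mean
    return out


def mean3(img: Image) -> Image:
    """3-tap box filter (shorter window at the edges)."""
    out = []
    for i in range(len(img)):
        window = img[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out


def gain2(img: Image) -> Image:
    return [min(1.0, 2 * p) for p in img]


def sqrt_curve(img: Image) -> Image:
    return [p ** 0.5 for p in img]


def candidate_sequences(tasks: Sequence[str], library: Library) -> Iterator[Sequence_]:
    """Every ordering of the tasks combined with every per-task model choice."""
    for order in permutations(tasks):
        for names in product(*(sorted(library[t]) for t in order)):
            yield tuple(zip(order, names))


def apply_sequence(image: Image, seq: Sequence_, library: Library) -> Image:
    for task, name in seq:
        image = library[task][name](image)
    return image


def best_sequence(image: Image, tasks: Sequence[str], library: Library,
                  score: Callable[[Image], float]) -> Sequence_:
    """sigma* = argmax over all candidate sequences of score(restored image)."""
    return max(candidate_sequences(tasks, library),
               key=lambda seq: score(apply_sequence(image, seq, library)))


library: Library = {
    "denoise": {"despike": despike, "mean3": mean3},
    "low_light": {"gain2": gain2, "sqrt": sqrt_curve},
}
reference = [0.2] * 5                       # toy clean image (assumed known)
degraded = [0.1, 0.1, 0.4, 0.1, 0.1]        # dark, with one noisy pixel


def score(restored: Image) -> float:
    """Toy scoring function: negative L1 distance to the reference."""
    return -sum(abs(r - c) for r, c in zip(restored, reference))


tasks = ("denoise", "low_light")
n_candidates = len(list(candidate_sequences(tasks, library)))
best = best_sequence(degraded, tasks, library, score)
print(n_candidates, best)
```

In this toy setup there are 2! orderings times 2 × 2 model choices, so 8 candidates, and the winner brightens first and then removes the spike: the spike only exceeds the `despike` threshold after the gain is applied, so the same two models in the opposite order leave it in place. The candidate count grows factorially with the number of degradations, which is why exhaustive search quickly becomes impractical.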

RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models
