Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration

Akshay Dudhane, Omkar Thawakar, Syed Waqas Zamir, Salman Khan, Fahad Shahbaz Khan, Ming-Hsuan Yang
2024-10-13
Abstract: All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation. The requirement to tackle multiple degradations using the same model can lead to high-complexity designs with a fixed configuration that lack the adaptability to more efficient alternatives. We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks. Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment with a single round of training. This seamless switching is enabled by our weights-sharing mechanism, forming the core of our architecture and facilitating the reuse of initialized module weights. Further, to establish robust weights initialization, we introduce a dynamic pre-training strategy that trains variants of the proposed DyNet concurrently, thereby achieving a 50% reduction in GPU hours. Our dynamic pre-training strategy eliminates the need for maintaining separate checkpoints for each variant, as all models share a common set of checkpoints, varying only in model depth. This efficient strategy significantly reduces storage overhead and enhances adaptability. To tackle the unavailability of the large-scale datasets required for pre-training, we curate a high-quality, high-resolution image dataset named Million-IRD, having 2M image samples. We validate our DyNet for image denoising, deraining, and dehazing in an all-in-one setting, achieving state-of-the-art results with a 31.34% reduction in GFLOPs and a 56.75% reduction in parameters compared to baseline models. The source codes and trained models are available at https://github.com/akshaydudhane16/DyNet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses four key challenges in image restoration (IR):

1. **Multi-type Degradation Processing**
   - **Problem Description**: Existing image restoration methods usually target a single type of degradation (such as denoising, deraining, or dehazing), so a separate model must be trained for each degradation type. The resulting systems are complex, lack generality, and adapt poorly to the multiple unknown degradation types encountered in practice.
   - **Solution**: The paper proposes a unified (all-in-one) model that handles multiple types of image degradation without designing and training a separate model for each one.
2. **Model Efficiency and Flexibility**
   - **Problem Description**: Traditional multi-task image restoration models often have a large number of parameters and high computational complexity, which makes them difficult to deploy on resource-constrained devices. Lightweight models improve efficiency but may sacrifice performance.
   - **Solution**: The paper introduces a dynamic network, DyNet. Its weight-sharing mechanism reduces the number of parameters and lets the model switch between variants of different complexity (such as the heavier DyNet-L and the lighter DyNet-S) after a single round of training, improving efficiency while maintaining high performance.
3. **Resource Consumption of Large-scale Pretraining**
   - **Problem Description**: Large-scale pretraining is an effective way to improve model performance, but it requires a large amount of GPU time and storage space, which is a major obstacle for researchers and institutions with limited resources.
   - **Solution**: The paper proposes a dynamic pretraining strategy that trains multiple model variants concurrently in a single pretraining run. This reduces GPU hours by about 50%, and because the variants share weights, it also reduces checkpoint storage overhead.
4. **Lack of High-quality Datasets**
   - **Problem Description**: Existing image restoration datasets are too small for large-scale pretraining, which limits how far model performance can be improved.
   - **Solution**: The paper curates a high-quality, high-resolution image dataset named Million-IRD, which contains about 2 million images and is intended for large-scale pretraining in image restoration.

Together, these contributions allow the proposed method to handle multiple image degradations efficiently while balancing resource usage against performance.
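The weight-sharing and concurrent-training ideas summarized above can be illustrated with a deliberately tiny sketch. It is not the DyNet implementation: the real model is an encoder-decoder network, whereas here a "block" is a single scalar weight in residual form. The names `SharedBlock`, `ToyVariant`, and `joint_step` are hypothetical, and summing the losses of both variants in one update is only an assumption about how training variants "concurrently" might look, not the paper's procedure.

```python
class SharedBlock:
    """Toy stand-in for a network module: one scalar weight, residual form."""

    def __init__(self, weight):
        self.weight = weight

    def __call__(self, x):
        return x + self.weight * x


class ToyVariant:
    """A model variant defined only by how often the shared block is reused."""

    def __init__(self, block, depth):
        self.block = block  # shared reference, not a copy
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):
            x = self.block(x)
        return x


def joint_step(block, variants, x, target, lr):
    """One gradient step on the summed squared error of all variants.

    Assumption for illustration: every variant contributes to the same
    update of the shared weight. Returns the loss before the update.
    """
    loss = 0.0
    grad = 0.0
    for v in variants:
        err = v.forward(x) - target
        loss += err ** 2
        # forward(x) = x * (1 + w) ** depth, so
        # d/dw = depth * x * (1 + w) ** (depth - 1)
        grad += 2.0 * err * v.depth * x * (1.0 + block.weight) ** (v.depth - 1)
    block.weight -= lr * grad
    return loss


block = SharedBlock(weight=0.3)
large = ToyVariant(block, depth=4)  # heavier variant: block reused 4 times
small = ToyVariant(block, depth=2)  # lighter variant: block reused 2 times

# A single training run updates the weight used by both variants.
losses = [
    joint_step(block, [large, small], x=1.0, target=1.0, lr=0.005)
    for _ in range(200)
]

# One checkpoint serves both depths; only the depth setting differs.
checkpoint = {"block_weight": block.weight}
```

In this toy setup, switching between the heavier and lighter variant at deployment time amounts to choosing a depth for the same stored weight, which mirrors the summary's point that the variants share checkpoints and differ only in model depth.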