Hierarchical Information Flow for Generalized Efficient Image Restoration

Yawei Li, Bin Ren, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Nicu Sebe, Ming-Hsuan Yang, Luca Benini
2024-11-28
Abstract: While vision transformers show promise in numerous image restoration (IR) tasks, the challenge remains in efficiently generalizing and scaling up a model for multiple IR tasks. To strike a balance between efficiency and model capacity for a generalized transformer-based IR method, we propose a hierarchical information flow mechanism for image restoration, dubbed Hi-IR, which progressively propagates information among pixels in a bottom-up manner. Hi-IR constructs a hierarchical information tree representing the degraded image across three levels. Each level encapsulates different types of information, with higher levels encompassing broader objects and concepts and lower levels focusing on local details. Moreover, the hierarchical tree architecture removes long-range self-attention, improves the computational efficiency and memory utilization, thus preparing it for effective model scaling. Based on that, we explore model scaling to improve our method's capabilities, which is expected to positively impact IR in large-scale training settings. Extensive experimental results show that Hi-IR achieves state-of-the-art performance in seven common image restoration tasks, affirming its effectiveness and generalizability.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses three main problems:

1. **Lack of an efficient, general-purpose computational mechanism for image restoration (IR)**: Current IR methods are usually designed for a specific type of degradation and struggle to handle many different types. For example, denoising, deblurring, and super-resolution typically call for different model architectures and parameter settings. Designing a general-purpose framework that is efficient and generalizes across multiple IR tasks remains a challenge.
2. **Lack of a systematic model scaling method**: Existing IR networks usually have a limited number of parameters (roughly 10 to 20M). When the model is scaled up to handle multiple degradation types, simply adding parameters often degrades performance. How to scale an IR model systematically while maintaining or improving performance is therefore a pressing open problem.
3. **Insufficient generalization of a single model across IR tasks**: Existing methods usually focus on one task or a few tasks, and have not fully verified whether a single model generalizes well across a wider range of IR tasks. Establishing this requires extensive experiments on the model's universality and adaptability.

To address these problems, the paper makes the following contributions:

- **Hierarchical information flow mechanism (Hi-IR)**: By constructing a tree-like information flow, Hi-IR aggregates and propagates information across multiple levels, combining global and local information. The mechanism improves the model's computational efficiency and its adaptability to different types of image degradation.
- **Model scaling strategies**: To overcome the performance drop caused by naively enlarging the model, the paper proposes three strategies: training warm-up, replacing heavyweight convolution operations, and selecting an appropriate self-attention mechanism (such as dot-product attention). These strategies significantly improve the convergence and performance of large-scale models.
- **Verification of generalization**: Through a series of rigorous experiments, the paper evaluates Hi-IR on multiple IR tasks covering down-sampling, motion blur, defocus blur, noise, and JPEG compression. The results show that Hi-IR performs well across tasks and degradation intensities, supporting its strong generalization ability.

In conclusion, the paper proposes an efficient, general-purpose image restoration framework and systematically addresses the problems of model scaling and generalization, offering new ideas and techniques for the field of image restoration.
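
Two of the scaling strategies named above, training warm-up and dot-product attention, are standard techniques that can be illustrated in a few lines. The following is a minimal sketch in plain Python, not the paper's implementation: the function names, the linear shape of the warm-up, and the values `base_lr=2e-4` and `warmup_steps=1000` are assumptions chosen for illustration, and the attention function is the generic scaled dot-product form rather than Hi-IR's hierarchical variant.

```python
import math


def warmup_lr(step, base_lr=2e-4, warmup_steps=1000):
    """Linear warm-up (assumed schedule): ramp up to base_lr, then hold it."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(queries, keys, values):
    """Generic scaled dot-product attention on lists of vectors.

    queries: list of d-dimensional vectors
    keys:    list of d-dimensional vectors (one per value)
    values:  list of vectors, all of the same length
    Returns one output vector per query.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

In a training loop, `warmup_lr(step)` would be evaluated once per optimizer step to set the learning rate, which keeps early updates small while a large model's statistics settle. Restricting `keys` and `values` to a local group of tokens, rather than the whole image, is one way to avoid long-range attention, though how Hi-IR groups tokens at each level is not specified in this summary.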