HazeCLIP: Towards Language Guided Real-World Image Dehazing

Ruiyi Wang, Wenhao Li, Xiaohong Liu, Chunyi Li, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai
2024-07-19
Abstract: Existing methods have achieved remarkable performance in single image dehazing, particularly on synthetic datasets. However, they often struggle with real-world hazy images due to domain shift, limiting their practical applicability. This paper introduces HazeCLIP, a language-guided adaptation framework designed to enhance the real-world performance of pre-trained dehazing networks. Inspired by the Contrastive Language-Image Pre-training (CLIP) model's ability to distinguish between hazy and clean images, we utilize it to evaluate dehazing results. Combined with a region-specific dehazing technique and tailored prompt sets, the CLIP model accurately identifies hazy areas, providing a high-quality, human-like prior that guides the fine-tuning process of pre-trained networks. Extensive experiments demonstrate that HazeCLIP achieves state-of-the-art performance in real-world image dehazing, evaluated through both visual quality and no-reference quality assessments. The code is available at https://github.com/Troivyn/HazeCLIP.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper attempts to address is the poor performance of existing dehazing methods when dealing with real-world hazy images. Although current methods achieve significant performance on synthetic datasets, they struggle to maintain the same effectiveness in practical applications due to the domain shift issue. Specifically, the paper points out:

1. **Limitations of Existing Methods**:
   - **Gap Between Synthetic Data and the Real World**: Existing dehazing methods are primarily trained on synthetic datasets, which differ significantly from real-world hazy images, leading to poor performance in practical applications.
   - **Domain Shift Issue**: Due to the domain differences between synthetic and real-world data, existing dehazing models often fail to accurately identify and remove haze in real-world images.
2. **Objectives**:
   - **Improve Dehazing Performance on Real-World Images**: The paper proposes a new framework, HazeCLIP, aimed at enhancing the performance of pre-trained dehazing networks in the real world through language-guided methods.
   - **Utilize Vision-Language Models**: HazeCLIP leverages the capabilities of the Contrastive Language-Image Pre-training (CLIP) model to distinguish between hazy and clear images and guides the dehazing process through specific prompt sets.
3. **Innovations**:
   - **Language-Guided Adaptation Framework**: HazeCLIP introduces a language-guided adaptation framework that combines region-specific dehazing techniques with customized prompt sets to improve dehazing performance on real-world images.
   - **Region-Specific Dehazing Techniques**: To overcome the limitations of the CLIP model in identifying hazy regions, the paper proposes a region-specific dehazing technique that separately processes sky and non-sky regions, thereby enhancing dehazing accuracy.
In summary, the main goal of this paper is to improve the performance of existing dehazing models on real-world images by introducing a language-guided adaptation framework, addressing the shortcomings of current methods in handling real-world hazy images.
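The idea of scoring a dehazed output against hazy and clear prompt sets, separately for sky and non-sky regions, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`haze_probability`, `region_haze_loss`), the averaging of each prompt set into one direction, the temperature value, and the toy 4-dimensional vectors standing in for CLIP text and image embeddings are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors along the last axis to unit length."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def prompt_direction(prompt_embs):
    """Collapse a prompt set into one unit vector by averaging (an assumption)."""
    return l2_normalize(l2_normalize(prompt_embs).mean(axis=0))


def haze_probability(image_emb, hazy_prompts, clear_prompts, temperature=0.01):
    """Softmax probability that the image matches the hazy prompt set."""
    img = l2_normalize(image_emb)
    # Cosine similarity to each prompt set, sharpened by the temperature.
    logits = np.array([
        img @ prompt_direction(hazy_prompts),
        img @ prompt_direction(clear_prompts),
    ]) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[0])


def region_haze_loss(region_embs, prompt_sets):
    """Mean haze probability over regions, each scored with its own prompt sets."""
    scores = [
        haze_probability(emb, *prompt_sets[name])
        for name, emb in region_embs.items()
    ]
    return sum(scores) / len(scores)


# Toy 4-d vectors standing in for CLIP text and image embeddings (hypothetical).
HAZY_PROMPTS = [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]]
CLEAR_PROMPTS = [[0.0, 1.0, 0.0, 0.0], [0.1, 0.9, 0.0, 0.0]]
# Each region could carry its own tailored prompts; here both reuse the same sets.
PROMPT_SETS = {
    "sky": (HAZY_PROMPTS, CLEAR_PROMPTS),
    "non_sky": (HAZY_PROMPTS, CLEAR_PROMPTS),
}

hazy_output = {"sky": [0.8, 0.2, 0.0, 0.0], "non_sky": [0.7, 0.3, 0.0, 0.0]}
clean_output = {"sky": [0.2, 0.8, 0.0, 0.0], "non_sky": [0.3, 0.7, 0.0, 0.0]}

loss_hazy = region_haze_loss(hazy_output, PROMPT_SETS)
loss_clean = region_haze_loss(clean_output, PROMPT_SETS)
print(loss_hazy > loss_clean)
```

In a fine-tuning setting, a quantity like `region_haze_loss` would be minimized so that the network's outputs move toward the clear prompt set; with real CLIP encoders the embeddings would come from the model's image and text towers rather than hand-written vectors.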