Abstract: Image dehazing aims to restore clean images from hazy ones. Convolutional Neural Networks (CNNs) and Transformers have demonstrated exceptional performance in local and global feature extraction, respectively, and currently represent the two mainstream frameworks in image dehazing. In this paper, we propose a novel dual-branch image dehazing framework that guides CNN and Transformer components interactively. We reconsider the complementary characteristics of CNNs and Transformers by leveraging the differential relationships between global and local features for interactive guidance. This approach enables the capture of local feature positions through global attention maps, allowing the CNN to focus solely on feature information at effective positions. The single-branch Transformer design ensures the network's global information recovery capability. Extensive experiments demonstrate that our proposed method yields competitive qualitative and quantitative evaluation performance on both synthetic and real public datasets. Codes are available at https://github.com/Feecuin/Two-Branch-Dehazing
What problem does this paper attempt to address?
This paper addresses image dehazing. Specifically, the authors propose a new dual-branch dehazing framework that combines the strengths of convolutional neural networks (CNNs) and Transformers to restore hazy images so that they are clearer and closer to the original haze-free scene. The problem matters in practical tasks such as autonomous driving, object detection, and drone aerial photography, because haze degrades the performance of vision systems.
### Problem Background
Haze is caused by tiny particles in the atmosphere scattering light, which reduces the visibility of objects and in turn degrades the performance of vision systems. Early dehazing techniques relied mainly on priors derived from empirical observation, such as the dark channel prior and the color attenuation prior. However, these prior-based methods adapt poorly to different scenes and may produce artifacts in regions that do not satisfy the prior assumptions.
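To make the scattering description concrete, the sketch below uses the atmospheric scattering model that is standard in the dehazing literature, I = J·t + A·(1 − t) with t = exp(−β·d), together with a simple dark channel computation. It is only an illustration under stated assumptions: the function names, the scalar airlight `A = 0.9`, the scattering coefficient `beta`, and the 3×3 patch size are all chosen here and are not taken from the paper.

```python
import numpy as np

def synthesize_haze(clean, depth, beta=1.0, airlight=0.9):
    """Apply I = J * t + A * (1 - t), with transmission t = exp(-beta * depth).

    clean: (H, W, 3) image in [0, 1]; depth: (H, W) scene depth.
    beta and airlight are illustrative values, not taken from the paper.
    """
    t = np.exp(-beta * depth)[..., None]  # transmission map, shape (H, W, 1)
    return clean * t + airlight * (1.0 - t)

def dark_channel(image, patch=3):
    """Minimum over the colour channels and a patch x patch neighbourhood."""
    min_rgb = image.min(axis=2)                  # per-pixel channel minimum
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")   # replicate borders
    h, w = min_rgb.shape
    out = np.empty_like(min_rgb)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 0.8, size=(8, 8, 3))    # toy haze-free image
depth = np.full((8, 8), 2.0)                     # toy constant depth
hazy = synthesize_haze(clean, depth)
```

On haze-free outdoor images the dark channel tends to be close to zero, while haze lifts it toward the airlight; this is the observation the dark channel prior exploits, and also why it can fail in regions such as sky or white objects.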
In recent years, deep-learning-based dehazing methods have become mainstream, and many of them use convolutional neural networks (CNNs). Although CNNs perform well at local feature extraction, their limited receptive field constrains the overall restoration quality. Meanwhile, Transformers have made significant progress in computer vision tasks thanks to their strong global feature extraction ability. However, a pure Transformer model may introduce unnecessary blurring and coarse details when reconstructing images.
### Paper Solution
To address these problems, the authors propose an interaction-guided dual-branch image dehazing network. Its main contributions are as follows:
1. **Dual-branch Framework**: A Transformer branch extracts global information and guides the CNN branch to focus on effective local details, thereby improving the dehazing result.
2. **Reduced Redundant Information**: Down-sampling operations differentiate the features extracted by the two branches, avoiding the redundancy caused by extracting the same features twice and improving model performance.
3. **Complementary Advantages**: The network exploits the complementary characteristics of CNNs and Transformers to produce high-quality dehazing results while using computing resources efficiently.
### Model Structure
- **Global Perception Module**: Use the improved DehazeFormer module to extract global features, including the optimization of the normalization layer and the spatial aggregation scheme.
- **Local Perception Module**: Introduce CNN as another branch to extract local features. Through the channel and pixel attention mechanism (CPA), generate an attention map to guide CNN to extract detail information more effectively.
- **Decoder**: Restore image details through skip connections to ensure that the final output image has both global consistency and retains local details.
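The guidance idea described in the module list above, where an attention map derived from global features tells the local branch which positions matter, can be sketched roughly as follows. This is a toy NumPy illustration and not the paper's CPA module: the function names, the weight shapes, and the specific choice of a pooled linear layer for channel attention and a 1×1 projection for pixel attention are all assumptions made here for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_pixel_attention(global_feat, w_channel, w_pixel):
    """Build a (C, H, W) attention map with values in (0, 1).

    global_feat: (C, H, W) features from the global (Transformer) branch.
    w_channel: (C, C) and w_pixel: (C,) are hypothetical learned weights.
    """
    # Channel attention: global average pooling, linear layer, sigmoid.
    pooled = global_feat.mean(axis=(1, 2))                     # (C,)
    channel_att = sigmoid(w_channel @ pooled)                  # (C,)
    # Pixel attention: 1x1 projection across channels, sigmoid.
    pixel_att = sigmoid(np.tensordot(w_pixel, global_feat, axes=1))  # (H, W)
    return channel_att[:, None, None] * pixel_att[None, :, :]

def guide_local(local_feat, attention):
    """Re-weight local (CNN) features by the globally derived attention."""
    return local_feat * attention

rng = np.random.default_rng(0)
C, H, W = 4, 6, 6
global_feat = rng.normal(size=(C, H, W))
local_feat = rng.normal(size=(C, H, W))
attention = channel_pixel_attention(
    global_feat, rng.normal(size=(C, C)), rng.normal(size=(C,))
)
guided = guide_local(local_feat, attention)
```

Because the attention values lie strictly between 0 and 1, the gating can only attenuate local features, which matches the intuition of suppressing positions the global branch considers uninformative. A real implementation would learn the weights end to end and would likely use convolutional layers in a deep learning framework.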
### Experimental Results
Experiments show that the method performs well on both synthetic datasets (such as RESIDE-6K) and real-world datasets (such as NH-HAZE and DENSE-HAZE). Compared with existing state-of-the-art methods, the model is competitive on quantitative metrics (PSNR, SSIM, entropy, and LPIPS) and produces results that look more natural and clearer in visual comparison.
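For readers unfamiliar with the metrics, the following sketch shows the usual definitions of two of them: PSNR, computed as 10·log10(MAX² / MSE), and the Shannon entropy of an image's intensity histogram. The function names and the 256-bin histogram are choices made for this example; the paper's exact evaluation code may differ, and SSIM and LPIPS are omitted because they need considerably more machinery.

```python
import numpy as np

def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

def entropy(gray, bins=256):
    """Shannon entropy (bits) of the intensity histogram of an image in [0, 1]."""
    hist, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()   # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

reference = np.zeros((4, 4))
restored = np.full((4, 4), 0.1)       # constant error of 0.1, so MSE = 0.01
score = psnr(reference, restored)     # 10 * log10(1 / 0.01) = 20 dB

two_level = np.array([[0.0, 1.0], [0.0, 1.0]])
bits = entropy(two_level)             # two equally likely levels give 1 bit
```

PSNR requires a ground-truth haze-free image, whereas entropy is a no-reference measure of how much intensity variation the output contains.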
### Conclusion
In summary, this paper proposes an interaction-guided dual-branch image dehazing network that combines the strengths of CNNs and Transformers. It addresses the limitations of prior-based and single-architecture methods described above, produces high-quality dehazing results, and is validated on multiple synthetic and real-world datasets.