Parallel Cross Strip Attention Network for Single Image Dehazing

Lihan Tong, Yun Liu, Tian Ye, Weijia Li, Liyuan Chen, Erkang Chen
2024-05-09
Abstract: The objective of single image dehazing is to restore hazy images and produce clear, high-quality visuals. Traditional convolutional models struggle with long-range dependencies due to their limited receptive field size. While Transformers excel at capturing such dependencies, their quadratic computational complexity in relation to feature map resolution makes them less suitable for pixel-to-pixel dense prediction tasks. Moreover, fixed kernels or tokens in most models do not adapt well to varying blur sizes, resulting in suboptimal dehazing performance. In this study, we introduce a novel dehazing network based on Parallel Cross Strip Attention (PCSA) with a multi-scale strategy. PCSA efficiently integrates long-range dependencies by simultaneously capturing horizontal and vertical relationships, allowing each pixel to capture contextual cues from an expanded spatial domain. To handle different sizes and shapes of blurs flexibly, we employ a channel-wise design with varying convolutional kernel sizes and strip lengths in each PCSA to capture context information at different scales. Additionally, we incorporate a softmax-based adaptive weighting mechanism within PCSA to prioritize and leverage more critical features.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper attempts to address the problem of single image dehazing, which involves recovering a clear, high-quality image from a hazy one. Specifically, the paper focuses on the following two challenges:

1. **Long-distance dependency problem**: Traditional convolutional models struggle to capture long-distance dependencies in images due to their limited receptive fields. Although Transformers excel at capturing long-distance dependencies, their computational complexity is proportional to the square of the feature map resolution, making them less suitable for pixel-level dense prediction tasks.
2. **Fixed-size convolution kernel or patch problem**: Most existing models use fixed-size convolution kernels or patches, which leads to poor performance when dealing with blurs of different sizes and shapes.

To address these issues, the authors propose a novel dehazing network based on Parallel Cross Strip Attention (PCSA) and incorporate a multi-scale strategy to flexibly handle blurs of different sizes and shapes. PCSA effectively integrates long-distance dependency information by simultaneously capturing relationships in both horizontal and vertical directions while maintaining low computational complexity. Additionally, the model employs a multi-scale design in the channel dimension to adapt to blurs of different sizes.
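
The complexity argument can be made concrete with a rough count of query-key pairs. The sketch below is illustrative only and is not taken from the paper: it assumes every pixel is one token, that full self-attention compares every pixel with every other pixel, and that a strip-style attention compares each pixel only with the pixels in its own row and its own column. The function names are hypothetical, and the actual PCSA module uses varying strip lengths, so its real cost will differ.

```python
def full_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every pixel attends to every pixel (assumed model)."""
    n = height * width
    return n * n


def strip_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when each pixel attends to its row plus its column.

    Assumption: strips span the full row (width entries) and the full
    column (height entries); the pixel itself is counted in both.
    """
    n = height * width
    return n * (height + width)


if __name__ == "__main__":
    # Compare the two counts on a few square feature-map sizes.
    for side in (64, 128, 256):
        full = full_attention_pairs(side, side)
        strip = strip_attention_pairs(side, side)
        print(f"{side}x{side}: full={full:,} strip={strip:,} "
              f"ratio={full / strip:.1f}x")
```

Under these assumptions, the ratio between the two counts on a square map of side s is s / 2, so the gap between full and strip-style attention widens as resolution increases. This is the motivation the summary gives for attending along horizontal and vertical directions instead of over all pixel pairs.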