Abstract: Deep convolutional neural networks (CNNs) using attention mechanisms have achieved great success in dynamic scene deblurring. In most of these networks, only the features refined by the attention maps are passed to the next layer, and the attention maps of different layers are separated from each other, which does not make full use of the attention information from different layers in the CNN. To address this problem, we introduce a new continuous cross-layer attention transmission (CCLAT) mechanism that can exploit hierarchical attention information from all the convolutional layers. Based on the CCLAT mechanism, we use a very simple attention module to construct a novel residual dense attention fusion block (RDAFB). In RDAFB, the attention maps inferred from the outputs of the preceding RDAFB and each layer are directly connected to the subsequent ones, leading to a CCLAT mechanism. Taking RDAFB as the building block, we design an effective architecture for dynamic scene deblurring named RDAFNet. Experiments on benchmark datasets show that the proposed model outperforms state-of-the-art deblurring approaches and demonstrate the effectiveness of the CCLAT mechanism.
What problem does this paper attempt to address?
The paper addresses dynamic scene deblurring. Specifically, the authors point out that existing methods based on deep convolutional neural networks (CNNs) have the following problems when handling motion blur in dynamic scenes:
1. **Limitations of the attention mechanism**: Although most existing methods use attention to enhance feature extraction, the attention maps of different layers are independent of each other, so the attention information of each layer is not fully utilized.
2. **Insufficient handling of non-uniform blur**: Real dynamic scenes often contain complex non-uniform blur (caused, for example, by camera shake or fast-moving objects), which existing methods do not handle effectively.
To solve these problems, the authors propose a new continuous cross-layer attention transmission (CCLAT) mechanism. Based on this mechanism, they design a novel residual dense attention fusion block (RDAFB) and use it to construct a deblurring network architecture, RDAFNet. With these components, the paper aims to improve the quality and robustness of image deblurring in dynamic scenes.
### Main contributions
1. **CCLAT mechanism**: Fully utilizes the hierarchical attention information of all convolutional layers through local dense connections.
2. **RDAFB module**: Based on the CCLAT mechanism, a simple attention module is used to construct the RDAFB, which serves as the building block of RDAFNet.
3. **Experimental verification**: Experimental results on multiple benchmark datasets show that the proposed model outperforms existing state-of-the-art deblurring methods, which supports the effectiveness of the CCLAT mechanism.
### Summary of mathematical formulas
- **Attention module**:
\[
M_{l,i}=\sigma(f_{1\times1}(F_{l,i}))
\]
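where \(f_{1\times1}\) represents the \(1\times1\) convolution operation and \(\sigma\) is the sigmoid activation function. Here \(F_{l,i}\) and \(M_{l,i}\) presumably denote the feature map and the attention map of the \(i\)-th layer in the \(l\)-th RDAFB.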
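The attention module above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the channel counts, the random weights, and the helper names `conv1x1` and `attention_map` are hypothetical, and a \(1\times1\) convolution is written as a per-pixel linear map over channels.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function, the sigma in the formula."""
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(features, weight, bias=None):
    """1x1 convolution as a per-pixel linear map over channels.

    features: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,) or None.
    """
    out = np.einsum("oc,chw->ohw", weight, features)
    if bias is not None:
        out = out + bias[:, None, None]
    return out

def attention_map(features, weight, bias=None):
    """M = sigma(f_1x1(F)): every entry lies strictly between 0 and 1."""
    return sigmoid(conv1x1(features, weight, bias))

# Hypothetical sizes: 8 channels on a 4x4 feature map.
rng = np.random.default_rng(0)
F = rng.standard_normal((8, 4, 4))
W = rng.standard_normal((8, 8)) * 0.1
M = attention_map(F, W)
```

Because the sigmoid squashes every value into (0, 1), `M` can act as a soft per-pixel, per-channel weighting of the features.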
- **Attention fusion module**:
\[
M'_{l,i}=D_{3\times3}(g_{1\times1}(\text{Cat}(M_{l,1},\dots,M_{l,i - 1},M_{l,i})))
\]
where \(\text{Cat}\) represents concatenating the attention maps along the channel dimension, \(g_{1\times1}\) represents the \(1\times1\) convolution operation, and \(D_{3\times3}\) represents the depth-separable convolution operation.
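As a rough sketch of the fusion step, the following NumPy code concatenates the attention maps, reduces them with a \(1\times1\) convolution, and applies a \(3\times3\) depth-separable convolution. Several details are assumptions rather than facts from the paper: the depth-separable convolution is taken to be a depthwise \(3\times3\) convolution followed by a pointwise \(1\times1\) convolution, zero padding keeps the spatial size, no activation is applied, and all shapes, weights, and function names are hypothetical.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution (cross-correlation) with zero padding.

    x: (C, H, W), kernels: (C, 3, 3); output keeps the shape of x.
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            tap = kernels[:, dy, dx][:, None, None]
            out += tap * padded[:, dy:dy + h, dx:dx + w]
    return out

def depth_separable_conv3x3(x, dw_kernels, pw_weight):
    """Assumed form of D_3x3: depthwise 3x3, then pointwise 1x1."""
    return conv1x1(depthwise_conv3x3(x, dw_kernels), pw_weight)

def fuse_attention(maps, g_weight, dw_kernels, pw_weight):
    """M' = D_3x3(g_1x1(Cat(M_1, ..., M_i)))."""
    stacked = np.concatenate(maps, axis=0)  # Cat along the channel axis
    reduced = conv1x1(stacked, g_weight)    # g_1x1 mixes and reduces channels
    return depth_separable_conv3x3(reduced, dw_kernels, pw_weight)

# Hypothetical example: three attention maps with 4 channels each.
rng = np.random.default_rng(0)
maps = [rng.random((4, 5, 5)) for _ in range(3)]
g_weight = rng.standard_normal((4, 12)) * 0.1
dw_kernels = rng.standard_normal((4, 3, 3)) * 0.1
pw_weight = rng.standard_normal((4, 4)) * 0.1
fused = fuse_attention(maps, g_weight, dw_kernels, pw_weight)
```

Concatenating all earlier maps before the \(1\times1\) convolution is what lets the fused map \(M'_{l,i}\) draw on attention information from every preceding layer rather than only the current one.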
- **Loss function**:
\[
L=\sum_{S = 1}^{3}[L_c+\lambda L_f]
\]
where \(L_c\) is the L1 loss, \(L_f\) is the frequency loss, \(\lambda\) is the weight parameter, and the sum runs over \(S = 1, 2, 3\).
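The loss can be illustrated with a small NumPy sketch. The summary does not define the frequency loss or say what \(S\) indexes, so everything beyond the formula itself is an assumption: here \(L_f\) is taken to be the mean absolute difference between the 2D Fourier transforms of prediction and target, \(S\) is taken to index three prediction/target pairs at different resolutions, and the value of `lam` is a placeholder, not the paper's setting.

```python
import numpy as np

def l1_loss(pred, target):
    """L_c: mean absolute error in the spatial domain."""
    return np.mean(np.abs(pred - target))

def frequency_loss(pred, target):
    """L_f (assumed form): mean absolute error between 2D FFTs."""
    diff = np.fft.fft2(pred, axes=(-2, -1)) - np.fft.fft2(target, axes=(-2, -1))
    return np.mean(np.abs(diff))

def total_loss(preds, targets, lam=0.1):
    """L = sum over S of [L_c + lam * L_f]; lam=0.1 is a placeholder."""
    return sum(
        l1_loss(p, t) + lam * frequency_loss(p, t)
        for p, t in zip(preds, targets)
    )

# Hypothetical data: three image pairs at decreasing resolution.
rng = np.random.default_rng(0)
targets = [rng.random((3, size, size)) for size in (16, 8, 4)]
preds = [t + 0.05 * rng.standard_normal(t.shape) for t in targets]
loss = total_loss(preds, targets)
```

Setting `lam=0.0` reduces the objective to a plain sum of L1 terms, which makes it easy to see how much the frequency term contributes.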
With these components, the paper reports that RDAFNet outperforms state-of-the-art methods on benchmark datasets for dynamic scene deblurring.