Abstract: In recent years, self-supervised denoising methods have shown impressive performance. They circumvent the painstaking collection of noisy-clean image pairs required by supervised denoising methods and broaden the applicability of denoising in the real world. One well-known self-supervised denoising strategy is the blind-spot training scheme. However, few works attempt to improve blind-spot based self-denoisers in terms of network architecture. In this paper, we take an intuitive view of the blind-spot strategy and consider its process of using neighboring pixels to predict manipulated pixels as an inpainting process. Therefore, we introduce a novel Mask Guided Residual Convolution (MGRConv) into common convolutional neural networks, e.g. U-Net, to promote blind-spot based denoising. Our MGRConv can be regarded as a soft partial convolution and finds a trade-off among partial convolution, learnable attention maps, and gated convolution. It enables dynamic mask learning with an appropriate mask constraint. Unlike partial convolution and gated convolution, it provides moderate freedom for network learning. It also avoids leveraging external learnable parameters for mask activation, unlike learnable attention maps. Experiments show that our proposed plug-and-play MGRConv can help blind-spot based denoising networks reach promising results with both existing single-image based and dataset-based methods.
What problem does this paper attempt to address?
The paper aims to improve the performance of blind-spot based self-supervised image denoising, specifically by improving the network architecture. It proposes a new network module, Mask Guided Residual Convolution (MGRConv), intended to strengthen both single-image and dataset-based blind-spot denoising methods and to offset the performance degradation attributed to noise modeling and training strategies.
### Background and Motivation of the Paper
In image denoising, self-supervised methods have attracted attention because they avoid the need to collect noisy-clean image pairs. Among them, the blind-spot training scheme is an effective self-supervised strategy: it randomly occludes pixels and computes the loss only at those positions, which prevents the network from learning the identity mapping. However, existing research mainly focuses on training strategies, and relatively few improvements have been made to the network architecture.
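The occlude-then-score idea described above can be illustrated with a minimal NumPy sketch. It is not the paper's code: the function names `make_blind_spots` and `masked_mse`, the 5x5 neighbor window, and the 5% masking ratio are all assumptions chosen for illustration.

```python
import numpy as np

def make_blind_spots(noisy, ratio=0.05, rng=None):
    """Hide a random subset of pixels in a 2D image (hypothetical helper).

    Each hidden pixel is overwritten with the value at a random offset inside
    a 5x5 window (an assumed window size). The offset can occasionally be
    (0, 0); a stricter variant would exclude it.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = noisy.shape
    mask = rng.random((h, w)) < ratio              # True where a pixel is hidden
    dy = rng.integers(-2, 3, size=(h, w))          # row offsets in [-2, 2]
    dx = rng.integers(-2, 3, size=(h, w))          # column offsets in [-2, 2]
    ys = np.clip(np.arange(h)[:, None] + dy, 0, h - 1)
    xs = np.clip(np.arange(w)[None, :] + dx, 0, w - 1)
    corrupted = np.where(mask, noisy[ys, xs], noisy)
    return corrupted, mask

def masked_mse(pred, target, mask):
    """Mean squared error restricted to the hidden (masked) pixels."""
    m = mask.astype(float)
    return float(((pred - target) ** 2 * m).sum() / max(m.sum(), 1.0))

# Usage: the network would receive `corrupted` and be scored against the
# original noisy values, but only at the hidden positions.
rng = np.random.default_rng(0)
noisy = rng.random((16, 16))
corrupted, mask = make_blind_spots(noisy, ratio=0.2, rng=rng)
loss = masked_mse(corrupted, noisy, mask)  # stand-in for a network prediction
```

Because unmasked pixels contribute nothing to the loss, copying the input to the output earns no reward at the positions that are actually scored, which is why the identity mapping is not a useful solution.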
### Main Contributions
1. **New Perspective**: The paper regards the blind-spot denoising process as an inpainting task, that is, using neighboring pixels to predict the occluded pixels.
2. **MGRConv Module**: The Mask Guided Residual Convolution (MGRConv) module is proposed. This module combines the advantages of partial convolution, learnable attention maps and gated convolution, while avoiding their respective disadvantages.
3. **Performance Improvement**: Experimental results show that the MGRConv module can significantly improve blind-spot denoising performance in both single-image and dataset-based settings.
### Technical Details
- **MGRConv Module**:
  - **Dynamic Mask Learning**: The MGRConv module guides information transfer by dynamically learning masks, avoiding the disadvantages of both hard cropping and free-form training.
  - **Residual Connection**: Through the residual connection, the MGRConv module avoids information loss and stabilizes the training process.
  - **Formula Representation**:
    \[
    I' = I_c+\phi(I_c)\odot\sigma(M_c)
    \]
    \[
    M'=\beta(M_c)
    \]
    where \(I'\) is the updated image feature, \(M'\) is the updated mask, \(I_c\) and \(M_c\) are the convolved image feature and mask feature respectively, \(\phi\) is the activation function, \(\sigma\) is the Sigmoid function, \(\odot\) denotes element-wise multiplication, and \(\beta\) is the mask update function.
- **Network Structure**:
  - The network is based on the U-Net architecture, with standard convolutions replaced by the MGRConv module.
  - In the Self2Self setting, a Dropout layer is inserted after each convolutional layer in the encoder.
  - In the Noise2Void setting, the architecture shown in Figure 1 is used directly.
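
The two update equations for MGRConv can be traced numerically with a single-channel NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the summary does not specify \(\phi\) or \(\beta\), so LeakyReLU is assumed for \(\phi\) and ReLU for \(\beta\), and the helper names (`conv2d_same`, `mgrconv`) are hypothetical.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaky_relu(z, slope=0.1):
    # Assumed choice for phi; the summary only says "activation function".
    return np.where(z > 0, z, slope * z)

def mgrconv(feature, mask, w_feat, w_mask):
    """One MGRConv step: I' = I_c + phi(I_c) * sigmoid(M_c), M' = beta(M_c)."""
    i_c = conv2d_same(feature, w_feat)   # convolved image feature I_c
    m_c = conv2d_same(mask, w_mask)      # convolved mask feature M_c
    out = i_c + leaky_relu(i_c) * sigmoid(m_c)
    new_mask = np.maximum(m_c, 0.0)      # assumed beta (ReLU); not from the source
    return out, new_mask

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8))
mask = (rng.random((8, 8)) > 0.2).astype(float)   # 0 marks a blind spot
w_f = rng.standard_normal((3, 3))
w_m = rng.standard_normal((3, 3))
out, new_mask = mgrconv(feat, mask, w_f, w_m)
```

Since the Sigmoid gate lies between 0 and 1, the mask only scales the activated branch, while the residual term \(I_c\) always passes through. This matches the "soft partial convolution" reading in the abstract, as opposed to a hard binary cut.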
### Experimental Results
- **Synthetic Noise Images**:
  - In the single-image setting, the MGRConv module significantly improves denoising performance, especially at high noise levels.
  - In the dataset-based setting, the MGRConv module also performs well, outperforming other existing methods.
- **Real-Noise Images**:
  - The experimental results on the PolyU dataset further verify the effectiveness of the MGRConv module.
### Conclusion
By introducing the MGRConv module, the paper improves the performance of blind-spot based self-supervised denoising methods, providing a new option for both single-image and dataset-based denoising.