A Dynamic Network with Transformer for Image Denoising

Mingjian Song, Wenbo Wang, Yue Zhao

DOI: https://doi.org/10.3390/electronics13091676

IF: 2.9

2024-04-27

Electronics

Abstract: Deep convolutional neural networks (CNNs) can achieve good performance in image denoising due to their superiority in the extraction of structural information. However, they may ignore the relationships between pixels, which limits their denoising performance. The Transformer, which focuses on pixel-to-pixel relationships, can effectively address this problem. This article aims to make a CNN and a Transformer complement each other in image denoising. In this study, we propose a dynamic network with Transformer for image denoising (DTNet), with a residual block (RB), a multi-head self-attention block (MSAB), and a multidimensional dynamic enhancement block (MDEB). Firstly, the RB not only utilizes a CNN but also lays the foundation for the combination with the Transformer. Then, the MSAB adds positional encoding and applies multi-head self-attention, which preserves sequential positional information while employing the Transformer to obtain global information. Finally, the MDEB uses dimension enhancement and dynamic convolution to improve the adaptive ability. The experiments show that our DTNet is superior to some existing methods for image denoising.
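The dynamic convolution named in the abstract can be illustrated with a minimal sketch. It follows one common formulation, in which several candidate kernels are mixed with input-dependent softmax weights before a single convolution is applied. This is an assumption for illustration only: the abstract does not specify how the MDEB builds its kernels, and every name below (`dynamic_kernel`, `gate_weights`, `gate_biases`, and so on) is hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_kernel(image, kernels, gate_weights, gate_biases):
    """Mix K candidate kernels with weights computed from the input itself.

    The gate here is a hypothetical one-layer map applied to the global
    average of the image; real networks use learned, multi-channel gates.
    """
    pixels = [p for row in image for p in row]
    pooled = sum(pixels) / len(pixels)  # global average pooling
    logits = [w * pooled + b for w, b in zip(gate_weights, gate_biases)]
    attn = softmax(logits)
    size = len(kernels[0])
    mixed = [
        [sum(a * k[i][j] for a, k in zip(attn, kernels)) for j in range(size)]
        for i in range(size)
    ]
    return mixed, attn


def conv2d_same(image, kernel):
    """Zero-padded, same-size cross-correlation on a single-channel image."""
    h, w, r = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[dy + r][dx + r]
            out[y][x] = acc
    return out


def dynamic_conv2d(image, kernels, gate_weights, gate_biases):
    """Apply the input-dependent mixed kernel to the image."""
    kernel, _ = dynamic_kernel(image, kernels, gate_weights, gate_biases)
    return conv2d_same(image, kernel)


# Two candidate kernels: pass-through and 3x3 box blur.
IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
BOX_BLUR = [[1.0 / 9.0] * 3 for _ in range(3)]
```

Because the mixing weights depend on the input, two different images are filtered with two different effective kernels, which is the adaptive behaviour the abstract attributes to dynamic convolution.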

engineering, electrical & electronic; computer science, information systems; physics, applied

What problem does this paper attempt to address?

The paper proposes a new method for image denoising. It identifies a limitation of conventional convolutional neural networks (CNNs): although CNNs extract structural information from images effectively, they may overlook inter-pixel relationships, which limits their performance in denoising tasks. The Transformer architecture, which focuses on inter-pixel relationships, is better suited to this issue.

To address the problem, the authors propose a model that combines the advantages of CNNs and Transformers: the Dynamic Network with Transformer for Image Denoising (DTNet). DTNet includes three main components:

1. **Residual Block (RB)**: It not only utilizes CNNs for local feature extraction but also segments the image through residual learning operations, preparing it for subsequent input to the Transformer.
2. **Multi-Head Self-Attention Block (MSAB)**: It adds positional encoding and applies a multi-head self-attention mechanism to retain the sequence order while capturing global features.
3. **Multidimensional Dynamic Enhancement Block (MDEB)**: It improves adaptability and computational efficiency through dimensional enhancement and dynamic convolution.

By integrating the structural feature extraction capability of CNNs with the Transformer's modeling of inter-pixel relationships, DTNet aims to improve image denoising performance. Experimental results show that DTNet outperforms some existing methods in image denoising. The paper also describes the model design, the choice of loss function, and the implementation of each module in detail.
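The MSAB description above (positional encoding followed by multi-head self-attention) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the sinusoidal encoding is the standard Transformer choice and is assumed here because the summary does not say which encoding DTNet uses, and the function names and weight matrices (`w_q`, `w_k`, `w_v`, `w_o`) are hypothetical.

```python
import math


def sinusoidal_positions(seq_len, dim):
    """Standard sinusoidal positional encoding (assumed, not confirmed by the paper)."""
    pe = [[0.0] * dim for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            pe[pos][i] = math.sin(angle) if i % 2 == 0 else math.cos(angle)
    return pe


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Add positions, then run scaled dot-product attention per head.

    tokens: seq_len x dim list of rows (e.g. flattened image patches).
    w_q, w_k, w_v, w_o: dim x dim projection matrices.
    """
    seq_len, dim = len(tokens), len(tokens[0])
    head_dim = dim // num_heads
    pe = sinusoidal_positions(seq_len, dim)
    x = [[t + p for t, p in zip(t_row, p_row)] for t_row, p_row in zip(tokens, pe)]
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

    concat = [[] for _ in range(seq_len)]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        k_t = [list(col) for col in zip(*kh)]  # transpose keys
        scores = matmul(qh, k_t)               # seq_len x seq_len
        weights = [softmax([s / math.sqrt(head_dim) for s in row]) for row in scores]
        head_out = matmul(weights, vh)         # every token attends to all tokens
        for i in range(seq_len):
            concat[i].extend(head_out[i])
    return matmul(concat, w_o)
```

Each output row is a weighted combination of all value rows, which is how self-attention captures global relationships, while the added positional encoding keeps the token order visible to the otherwise order-agnostic attention step.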

Dynamic Residual Dense Network For Image Denoising

Spatial-Adaptive Network for Single Image Denoising

Texture Compensation with Multi-Scale Dilated Residual Blocks for Image Denoising.

A cross Transformer for image denoising

Dilated kernel prediction network for

Image Denoising Via Multi-Scale Gated Fusion Network

Multinoise-type Blind Denoising Using a Single Uniform Deep Convolutional Neural Network.

DDT: Dual-branch Deformable Transformer for Image Denoising

DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer

Dilated Residual Encode-Decode Networks for Image Denoising

An Efficient Dehazing Algorithm Based on the Fusion of Transformer and Convolutional Neural Network.

CTFCD: Channel Transformer Based on Full Convolutional Decoder for Single Image Deraining

Efficient Lightweight Image Denoising with Triple Attention Transformer

SUMD: Super U-shaped Matrix Decomposition Convolutional neural network for Image denoising

Dual Residual Attention Network for Image Denoising

A Hybrid CNN for Image Denoising

Deep Convolutional Architecture for Natural Image Denoising

Multi-stage image denoising with the wavelet transform

Self-Supervised Image Denoising for Real-World Images with Context-aware Transformer

EWT: Efficient Wavelet-Transformer for Single Image Denoising