AMSA-UNet: An Asymmetric Multiple Scales U-net Based on Self-attention for Deblurring

Yingying Wang
2024-06-13
Abstract: The traditional single-scale U-Net often leads to the loss of spatial information during deblurring, which affects the deblurring accuracy. Additionally, because convolution is limited in its ability to capture long-range dependencies, the quality of the recovered image is degraded. To address these problems, an asymmetric multiple scales U-net based on self-attention (AMSA-UNet) is proposed to improve accuracy and reduce computational complexity. By introducing a multi-scale U-shaped architecture, the network can focus on blurry regions at the global level and better recover image details at the local level. To overcome the limitations of traditional convolutional methods in capturing long-range dependencies, a self-attention mechanism is introduced into the decoder part of the backbone network. This significantly increases the model's receptive field and enables it to pay more attention to the semantic information of the image, thereby producing more accurate and visually pleasing deblurred images. Furthermore, a frequency domain-based computation method is introduced to reduce the amount of computation. The experimental results demonstrate that the proposed method exhibits significant improvements in both accuracy and speed compared to eight excellent methods.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main problems that this paper attempts to solve are the loss of spatial information caused by the traditional single-scale U-Net in the process of image deblurring, and the limitations of the convolution method in capturing long-distance dependencies. These problems affect the accuracy of deblurring and the quality of the restored image. Specifically:

1. **Loss of spatial information**: When dealing with deblurring tasks, the traditional single-scale U-Net is prone to cause the loss of spatial information, thus affecting the accuracy of deblurring.
2. **Capturing long-distance dependencies**: Due to the limitations of the convolution method, it is difficult to effectively capture long-distance dependencies in the image, resulting in a decline in the quality of the restored image.

To solve the above problems, the paper proposes an asymmetric multi-scale U-Net (AMSA-UNet) based on the self-attention mechanism. By introducing a multi-scale structure and the self-attention mechanism, this method improves the receptive field of the model and enhances the computational efficiency, thereby achieving more effective deblurring.

### Main improvement points:

- **Multi-scale architecture**: By introducing a multi-scale U-shaped architecture, the network can focus on the blurred area at the global level and better restore image details at the local level.
- **Self-attention mechanism**: Introducing the self-attention mechanism in the decoder part of the backbone network significantly increases the receptive field of the model, enabling it to pay more attention to the semantic information of the image, thereby generating more accurate and visually more satisfactory deblurred images.
- **Frequency-domain calculation method**: Introducing a frequency-domain-based calculation method reduces the amount of calculation and further improves the efficiency of the model.
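
The summary does not spell out how the frequency-domain computation works, so the following is only a generic illustration of why moving to the frequency domain can cut cost, not the paper's actual method. By the correlation theorem, the circular cross-correlation of two length-n signals, which costs O(n^2) multiplications when computed directly, equals the inverse FFT of an element-wise product and costs O(n log n). The function names (`fft`, `circular_correlation_direct`, `circular_correlation_fft`) and the 1-D, power-of-two-length setting are assumptions made for this sketch.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def circular_correlation_direct(q, k):
    """c[s] = sum_t q[(t + s) % n] * k[t], computed directly in O(n^2)."""
    n = len(q)
    return [sum(q[(t + s) % n] * k[t] for t in range(n)) for s in range(n)]


def circular_correlation_fft(q, k):
    """Same quantity in O(n log n): IFFT(FFT(q) * conj(FFT(k))) for real inputs."""
    n = len(q)
    q_hat, k_hat = fft(q), fft(k)
    product = [a * b.conjugate() for a, b in zip(q_hat, k_hat)]
    # The inverse transform above is unnormalized, so divide by n here.
    return [value.real / n for value in fft(product, inverse=True)]


# Hypothetical inputs: both routes should agree up to floating-point error.
query = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
key = [0.5, -1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 3.0]
direct = circular_correlation_direct(query, key)
fast = circular_correlation_fft(query, key)
max_error = max(abs(a - b) for a, b in zip(direct, fast))
```

In a network, the same identity would be applied to 2-D feature maps with a library FFT rather than this hand-written one; the pure-Python version is used here only to keep the example free of dependencies.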
### Experimental results:

The experimental results show that the proposed method is superior to eight existing excellent methods in terms of both accuracy and speed. In particular, the test results on the GoPro dataset show that the average PSNR of AMSA-UNet reaches 30.55 dB, which is a significant improvement compared to other methods. In summary, this paper addresses the shortcomings of the traditional U-Net in deblurring tasks by combining the multi-scale architecture and the self-attention mechanism, and significantly improves the deblurring effect and efficiency.
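
For reference, PSNR (peak signal-to-noise ratio) is defined as 10 · log10(MAX² / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the sharp reference and the restored image. The sketch below computes it for flattened pixel lists and averages it over image pairs. The function names, the flat-list representation, and the default peak value of 255 (8-bit images) are assumptions for illustration; the paper's exact evaluation protocol is not given in this summary.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized, flattened images."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_psnr(pairs, max_value=255.0):
    """Mean of per-image PSNR over (reference, restored) pairs."""
    scores = [psnr(ref, out, max_value) for ref, out in pairs]
    return sum(scores) / len(scores)
```

Averaging per-image PSNR values, as done here, is one common convention; an evaluation that pools the squared error across all images before taking the logarithm would give a different number.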