UTDM: a universal transformer-based diffusion model for multi-weather-degraded images restoration

Yongbo Yu, Weidong Li, Linyan Bai, Jinlong Duan, Xuehai Zhang
DOI: https://doi.org/10.1007/s00371-024-03659-x
IF: 2.835
2024-10-06
The Visual Computer
Abstract: Restoring multi-weather-degraded images is important for subsequent high-level computer vision tasks. However, most existing image restoration algorithms target only single-weather-degraded images, and there are few general models for multi-weather-degraded image restoration. In this paper, we propose a universal transformer-based diffusion model (UTDM) for multi-weather-degraded image restoration, which combines the denoising diffusion probabilistic model with the Vision Transformer (ViT). First, UTDM uses weather-degraded images as conditions to guide the diffusion model to generate clean background images through reverse sampling. Second, we propose a Cascaded Fusion Noise Estimation Transformer (CFNET) based on ViT, which uses the degraded and noisy images for noise estimation. By computing contextual fusion attention for different heads in a cascaded manner, CFNET explores the commonalities and characteristics of multi-weather-degraded images, capturing both global and local feature information to improve the model's generalization ability on various weather-degraded images. UTDM outperforms existing algorithms by 0.14–4.55 dB on the Raindrop-A test set, and improves on Transweather by 0.99 dB and 1.24 dB on the Snow100K-L and Test1 test sets, respectively. Experimental results show that our method outperforms general and specific restoration task algorithms on synthetic and real-world degraded image datasets. Code and dataset are available at: https://github.com/RHEPI/UTDM.
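The conditional reverse sampling described in the abstract can be illustrated with a minimal sketch of standard DDPM ancestral sampling, in which the noise estimator sees both the noisy image and the degraded image. This is not the authors' implementation: the linear noise schedule, the flat-list image representation, and the names `make_schedule`, `make_oracle_estimator`, and `conditional_reverse_sample` are all assumptions made for illustration, and the toy estimator below is a hypothetical stand-in for CFNET, not a trained network.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice; the abstract does not specify one)."""
    if num_steps == 1:
        betas = [beta_start]
    else:
        step = (beta_end - beta_start) / (num_steps - 1)
        betas = [beta_start + i * step for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a  # cumulative product of alphas
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def make_oracle_estimator(alpha_bars):
    """Toy stand-in for a learned noise estimator (hypothetical, not CFNET).

    It pretends the clean image equals the degraded image and returns the
    noise that would be consistent with that guess at step t.
    """
    def estimator(x_t, degraded, t):
        a_bar = alpha_bars[t]
        return [(x - math.sqrt(a_bar) * d) / math.sqrt(1.0 - a_bar)
                for x, d in zip(x_t, degraded)]
    return estimator


def conditional_reverse_sample(degraded, noise_estimator,
                               betas, alphas, alpha_bars, seed=0):
    """DDPM ancestral sampling conditioned on a degraded image.

    degraded: flat list of pixel values used as the condition.
    noise_estimator(x_t, degraded, t): predicted noise, same length as x_t.
    """
    rng = random.Random(seed)
    # Start from pure Gaussian noise with the same shape as the condition.
    x = [rng.gauss(0.0, 1.0) for _ in degraded]
    for t in reversed(range(len(betas))):
        eps = noise_estimator(x, degraded, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # assumed variance choice: sigma_t^2 = beta_t
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x
```

With the toy estimator, the sampler returns the conditioning image itself, which only checks the update arithmetic; in a real system the estimator would be a trained network that takes the degraded and noisy images as input.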
computer science, software engineering
What problem does this paper attempt to address?