Generative Adversarial Network on Motion-Blur Image Restoration

Zhengdong Li
2024-12-27
Abstract: In everyday life, photographs taken with a camera often suffer from motion blur caused by hand vibration or sudden movement. This blur can significantly degrade image quality, which makes it an interesting challenge to develop a deep learning model that uses adversarial training to restore clarity to the blurred pixels. In this project, we focus on using Generative Adversarial Networks (GANs) to deblur images affected by motion blur. A GAN-based TensorFlow model is defined, then trained and evaluated on the GoPro dataset, which comprises paired street-view images in sharp and blurred versions. The adversarial training process between the Discriminator and the Generator helps to produce increasingly realistic images over time. Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) are the two evaluation metrics used to quantify image quality and thus the effectiveness of the deblurring process. The model achieves a mean PSNR of 29.1644 and a mean SSIM of 0.7459, with an average deblurring time of 4.6921 seconds. The blurred pixels are visibly sharper in the output of the GAN model, which indicates good image restoration in real-world applications.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The problem this paper addresses is motion blur in everyday photos caused by hand shake or sudden movement, which significantly reduces image quality. It is therefore an interesting challenge to develop a deep-learning model that uses the principle of adversarial networks to restore the clarity of the blurred pixels. Specifically, the author uses generative adversarial networks (GANs) to remove blur from motion-blurred images. Training and evaluation are carried out on the GoPro dataset, which contains paired street-view images in sharp and blurred versions. The competition between the generator and the discriminator during adversarial training helps to generate increasingly realistic images over time. To measure image quality quantitatively and evaluate the effectiveness of the deblurring process, the author uses two evaluation metrics: peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). The experimental results show an average PSNR of 29.1644, an average SSIM of 0.7459, and an average deblurring time of 4.6921 seconds per image.

### Formula Summary

1. **Adversarial Loss Function**:

\[ \min_G \max_D V(D, G)=\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)]+\mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))] \]

where:

- \( G \) is the generator function, which generates a restored image from the blurred image \( z \) in the GoPro dataset.
- \( D \) is the discriminator function, which distinguishes between real and generated images.
- \( V(D, G) \) is the value function of the adversarial game between the generator \( G \) and the discriminator \( D \).
- \( \mathbb{E}_x \) represents the expectation over the real images \( x \).
- \( D(x) \) is the probability that the discriminator assigns to the real image \( x \) being real.
- \( \mathbb{E}_z \) represents the expectation over the blurred images \( z \).
- \( G(z) \) is the image generated from the blurred image \( z \).
- \( D(G(z)) \) is the probability that the discriminator identifies the generated image as a real image.
- \( \log D(x) \) is the log-probability that the discriminator correctly identifies the real image as real.
- \( \log (1 - D(G(z))) \) is the log-probability that the discriminator correctly identifies the generated image as generated rather than real.

2. **Perceptual Loss Function**:

\[ L_{\text{perceptual}}(y_{\text{true}}, y_{\text{pred}})=\frac{1}{N} \sum_{i = 1}^{N}\|\phi(y_{\text{true}})-\phi(y_{\text{pred}})\|_2 \]

where:

- \( y_{\text{true}} \) is the real image.
- \( y_{\text{pred}} \) is the generated image.
- \( \phi \) is the feature extraction model (such as VGG16) applied to the image.
- \( N \) is the number of features in the output of the layer 'block3_conv3'.
- \( \|\cdot\|_2 \) represents the Euclidean norm.

### Conclusion

This research demonstrates the effectiveness of GANs for image deblurring, with notable improvements in image clarity and structural fidelity. Although performance on some custom-blurred images falls short of expectations, the overall results still show the potential of GANs for solving the motion-blur problem in the real world. Future work will focus on improving the model architecture, using larger-scale datasets, and exploring different network structures.
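
Returning to the Formula Summary, the two loss functions and the PSNR metric can be illustrated with a minimal, dependency-free Python sketch. It is not the author's TensorFlow implementation. Several details are assumptions made for illustration: the function names, the small `eps` guard inside the logarithms, the reading of \( \phi \) as a callable that returns \( N \) feature vectors (one Euclidean distance is taken per feature vector and the distances are averaged), and the 8-bit peak value of 255 in the standard PSNR definition.

```python
import math


def adversarial_value(d_real, d_fake, eps=1e-12):
    """Batch estimate of V(D, G) from discriminator outputs.

    d_real: values D(x) for sharp images, each in (0, 1).
    d_fake: values D(G(z)) for restored images, each in (0, 1).
    eps is an assumed numerical guard against log(0).
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def perceptual_loss(phi, y_true, y_pred):
    """Mean Euclidean distance between corresponding feature vectors.

    phi is a stand-in for a feature extractor such as VGG16; here it is
    assumed to return a list of N equal-length feature vectors.
    """
    f_true, f_pred = phi(y_true), phi(y_pred)
    total = 0.0
    for a, b in zip(f_true, f_pred):
        total += math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return total / len(f_true)


def psnr(reference, restored, max_value=255.0):
    """Standard PSNR over two flat pixel sequences of equal length."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

In this sketch the discriminator tries to increase `adversarial_value` while the generator tries to decrease its second term, which mirrors the min-max form of \( V(D, G) \). A real pipeline would replace `phi` with the 'block3_conv3' activations of a pretrained network and operate on tensors rather than Python lists.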