Progressive Face Super-Resolution via Attention to Facial Landmark

Deokyun Kim, Minseon Kim, Gihyun Kwon, Dae-Shik Kim
DOI: https://doi.org/10.48550/arXiv.1908.08239
2019-08-22
Abstract: Face Super-Resolution (SR) is a subfield of the SR domain that specifically targets the reconstruction of face images. The main challenge of face SR is to restore essential facial features without distortion. We propose a novel face SR method that generates photo-realistic 8x super-resolved face images with fully retained facial details. To that end, we adopt a progressive training method, which allows stable training by splitting the network into successive steps, each producing output with a progressively higher resolution. We also propose a novel facial attention loss and apply it at each step to focus on restoring facial attributes in greater details by multiplying the pixel difference and heatmap values. Lastly, we propose a compressed version of the state-of-the-art face alignment network (FAN) for landmark heatmap extraction. With the proposed FAN, we can extract the heatmaps suitable for face SR and also reduce the overall training time. Experimental results verify that our method outperforms state-of-the-art methods in both qualitative and quantitative measurements, especially in perceptual quality.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to keep facial details from being distorted when a low-resolution (LR) face image is upscaled to a high-resolution (HR) image in face super-resolution. Specifically, it focuses on restoring key facial features, such as the details of the eyes, nose, and mouth, while avoiding the distortion and blurring that upscaling introduces.

### Background of the Paper

Face Super-Resolution (SR) is a subfield of image super-resolution that targets the reconstruction of face images. The main challenge is to restore clear, undistorted facial details while enlarging the image. Traditional super-resolution methods often lose small facial features when upscaling low-resolution images, which lowers the quality of the generated high-resolution images.

### Methods in the Paper

To overcome these challenges, the authors propose the following key techniques:

1. **Progressive Training Method**:
   - The network is split into successive steps and its complexity is increased gradually, moving from low resolution to high resolution, which yields a stable training process. Each step produces a higher-resolution image than the previous one, so image quality improves step by step.
2. **Facial Attention Loss**:
   - A new loss function is introduced. It weights the pixel-wise error by a landmark heatmap, so that errors around facial key points count more and the network pays more attention to restoring detail in those areas. The formula is:

   \[ L_{\text{attention}}=\frac{1}{r^{2}WH}\sum_{x=1}^{rW}\sum_{y=1}^{rH}\left(M^{*}_{x,y}\cdot\left|(I_{\text{HR}})_{x,y}-G(I_{\text{LR}})_{x,y}\right|\right) \]

   where \(G\) is the face super-resolution network, \(I_{\text{HR}}\) and \(I_{\text{LR}}\) are the target high-resolution image and the input low-resolution image respectively, \(M^{*}\) is the landmark attention heatmap, \(r\) is the upscaling factor, and \(W \times H\) is the size of the low-resolution input.
3. **Distilled Face Alignment Network (FAN)**:
   - A compressed version of the face alignment network is proposed to extract landmark heatmaps suitable for face super-resolution. Trained with a hint-based method, this network has significantly fewer parameters while maintaining performance comparable to the original FAN. This yields heatmaps suited to face SR and also greatly reduces the overall training time.

### Experimental Results

The authors conducted experiments on multiple datasets, including CelebA and AFLW. The results show that the method outperforms existing state-of-the-art methods in both qualitative and quantitative measurements. The gain is most visible in perceptual quality: in a Mean-Opinion-Score (MOS) test, the images generated by this method were rated visually closer to the real images.

### Main Contributions

1. **Applying the Progressive Training Method to Face Super-Resolution for the First Time**: By gradually increasing the complexity of the network, stable and high-quality image generation is achieved.
2. **Introducing Facial Attention Loss**: It enables the network to restore facial details more accurately, especially in the areas around the landmarks.
3. **Proposing the Compressed FAN**: While reducing the number of parameters, it maintains performance comparable to the original FAN and significantly shortens the training time.

In conclusion, the paper combines progressive training, a facial attention loss, and a compressed FAN to address the key problems in face super-resolution and to generate high-quality, undistorted high-resolution face images.
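
To make the facial attention loss from the Methods section concrete, here is a minimal NumPy sketch of the formula as it is written above. It is an illustration under stated assumptions, not the authors' implementation: the function name `facial_attention_loss` and the argument names `sr`, `hr`, and `heatmap` are hypothetical, images are assumed to be single-channel 2D arrays of size rH x rW, and the heatmap is assumed to be already resized to the output resolution.

```python
import numpy as np


def facial_attention_loss(sr, hr, heatmap):
    """Heatmap-weighted mean absolute error (illustrative sketch).

    sr:      super-resolved output G(I_LR), shape (rH, rW)   [assumed 2D]
    hr:      target image I_HR, shape (rH, rW)
    heatmap: landmark attention map M*, shape (rH, rW)
    """
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    heatmap = np.asarray(heatmap, dtype=np.float64)
    if not (sr.shape == hr.shape == heatmap.shape):
        raise ValueError("sr, hr and heatmap must share one shape")

    # Per-pixel absolute error, scaled by the attention value at that pixel.
    weighted = heatmap * np.abs(hr - sr)

    # Dividing the sum by the pixel count r^2 * W * H is a plain mean.
    return weighted.mean()


# Toy example: the output is wrong by 1.0 at every pixel of a 4x4 image.
hr = np.zeros((4, 4))
sr = np.ones((4, 4))

# Attention on a single "landmark" pixel: only that pixel's error counts.
single_landmark = np.zeros((4, 4))
single_landmark[1, 2] = 1.0
print(facial_attention_loss(sr, hr, single_landmark))  # 0.0625 (= 1/16)

# Uniform attention reduces to the ordinary mean absolute error.
print(facial_attention_loss(sr, hr, np.ones((4, 4))))  # 1.0
```

The toy example shows the intended effect of the weighting: errors at pixels where the heatmap is zero contribute nothing, so the gradient signal from this term concentrates on the regions around landmarks.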