Rank-based No-reference Quality Assessment for Face Swapping

Xinghui Zhou, Wenbo Zhou, Tianyi Wei, Shen Chen, Taiping Yao, Shouhong Ding, Weiming Zhang, Nenghai Yu
2024-06-04
Abstract: Face swapping has become a prominent research area in computer vision and image processing due to rapid technological advancements. The metric of measuring the quality in most face swapping methods relies on several distances between the manipulated images and the source image, or the target image, i.e., there are suitable known reference face images. Therefore, there is still a gap in accurately assessing the quality of face interchange in reference-free scenarios. In this study, we present a novel no-reference image quality assessment (NR-IQA) method specifically designed for face swapping, addressing this issue by constructing a comprehensive large-scale dataset, implementing a method for ranking image quality based on multiple facial attributes, and incorporating a Siamese network based on interpretable qualitative comparisons. Our model demonstrates state-of-the-art performance in the quality assessment of swapped faces, providing coarse- and fine-grained assessments. Enhanced by this metric, an improved face-swapping model achieved a more advanced level with respect to expressions and poses. Extensive experiments confirm the superiority of our method over existing general no-reference image quality assessment metrics and the latest metric of facial image quality assessment, making it well suited for evaluating face swapping images in real-world scenarios.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to accurately evaluate the quality of face-swapping images when no reference image is available. Current full-reference image quality assessment (FR-IQA) methods rely on distance errors from the source or target image, which limits them in practice, especially when the target face cannot be obtained. Existing no-reference image quality assessment (NR-IQA) and facial image quality assessment (FIQA) methods also perform poorly on face-swapping images because they often overlook non-identity attributes such as lighting and expression. To address this, the authors propose a rank-based no-reference quality assessment method that constructs a comprehensive large-scale dataset, ranks image quality using multiple facial attributes, and trains a Siamese network for interpretable qualitative comparison.

### Main contributions

1. **Large-scale dataset**: The authors created a dataset of more than 1 million face-swapping images generated by five different face-swapping methods. The dataset carries ranking labels based on the consistency of face attribute vectors and on perceptual similarity.
2. **New quality assessment metric**: The authors propose a rank-based no-reference quality assessment method. Unlike other image quality assessment methods, it provides a comprehensive assessment consistent with human perception without requiring reference images.
3. **Experimental verification**: Extensive experiments show that this method significantly outperforms other methods in evaluating the visual realism of deepfake faces. In addition, adding the metric as a loss function when training an existing face-swapping model can significantly reduce attribute errors and improve image quality.

### Method overview

- **Dataset construction**: High-resolution face images from the CelebAMask-HQ dataset serve as source and target images, and five advanced face-swapping methods generate a large number of face-swapping images from them.
- **Ranking label generation**: A 3D face reconstruction model (EMOCA), a pose estimation model (6DRepNet), and a perceptual similarity measure (LPIPS) produce loss values for multiple attributes, from which ranking labels are derived.
- **Siamese network training**: The Siamese network is trained with a boundary-aware ranking loss function to learn the quality relationship between the two images in each pair.

This method improves the accuracy of face-swapping image quality assessment and also enhances the robustness and generalization ability of the model, making it better suited to practical application scenarios.
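The ranking-label and pairwise-training steps in the method overview can be illustrated with a minimal sketch. It is not the authors' implementation. The attribute errors below are made-up numbers standing in for EMOCA, 6DRepNet, and LPIPS outputs, the equal weights are an assumption, and the "boundary-aware" ranking loss is read here as a standard margin-based hinge loss, which is also an assumption. All function names are hypothetical.

```python
from itertools import combinations


def combined_error(errors, weights):
    """Weighted sum of per-attribute errors; a lower value means better quality."""
    return sum(weights[name] * value for name, value in errors.items())


def ranked_pairs(samples, weights):
    """Return (better_id, worse_id) for every pair whose combined errors differ."""
    scored = {sid: combined_error(err, weights) for sid, err in samples.items()}
    pairs = []
    for a, b in combinations(scored, 2):
        if scored[a] < scored[b]:
            pairs.append((a, b))
        elif scored[b] < scored[a]:
            pairs.append((b, a))
        # Ties carry no ranking information and are skipped.
    return pairs


def margin_ranking_loss(score_better, score_worse, margin=0.5):
    """Hinge loss on predicted quality scores for one ranked pair.

    The loss is zero once the better image outscores the worse one
    by at least `margin` (assumed value, not taken from the paper).
    """
    return max(0.0, margin - (score_better - score_worse))


# Hypothetical attribute errors for three swapped images of the same pair.
samples = {
    "swap_a": {"expression": 0.10, "pose": 0.05, "perceptual": 0.20},
    "swap_b": {"expression": 0.30, "pose": 0.10, "perceptual": 0.25},
    "swap_c": {"expression": 0.20, "pose": 0.40, "perceptual": 0.50},
}
# Equal weighting is an assumption made for illustration only.
weights = {"expression": 1.0, "pose": 1.0, "perceptual": 1.0}

pairs = ranked_pairs(samples, weights)
```

In a Siamese setup, both images of each pair in `pairs` would pass through the same scoring network, and `margin_ranking_loss` would be applied to the two predicted scores, so the network learns relative quality without needing a reference image at inference time.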