Abstract: Face swapping has become a prominent research area in computer vision and image processing due to rapid technological advancements. Most face swapping methods measure quality using several distances between the manipulated images and the source or target image, i.e., they assume that suitable reference face images are available. Therefore, there is still a gap in accurately assessing the quality of face swapping in reference-free scenarios. In this study, we present a novel no-reference image quality assessment (NR-IQA) method specifically designed for face swapping, addressing this issue by constructing a comprehensive large-scale dataset, implementing a method for ranking image quality based on multiple facial attributes, and incorporating a Siamese network based on interpretable qualitative comparisons. Our model demonstrates state-of-the-art performance in the quality assessment of swapped faces, providing both coarse- and fine-grained assessment. Enhanced by this metric, an improved face-swapping model achieved a more advanced level with respect to expressions and poses. Extensive experiments confirm the superiority of our method over existing general no-reference image quality assessment metrics and the latest facial image quality assessment metric, making it well suited for evaluating face swapping images in real-world scenarios.

What problem does this paper attempt to address?

The paper addresses how to accurately evaluate the quality of face-swapping images when no reference image is available. Current full-reference image quality assessment (FR-IQA) methods rely on distance errors relative to the source or target image, which limits their practical use, especially when the target face cannot be obtained. Existing no-reference image quality assessment (NR-IQA) and facial image quality assessment (FIQA) methods also perform poorly on face-swapping images because they often overlook non-identity attributes such as lighting and expression. To address this, the authors propose a rank-based no-reference quality assessment method built on three components: a comprehensive large-scale dataset, an image quality ranking method based on multiple face attributes, and a Siamese network for interpretable qualitative comparison.

### Main contributions

1. **Large-scale dataset**: Created a dataset containing more than 1 million face-swapping images generated by five different face-swapping methods. The dataset carries ranking labels based on the consistency of face attribute vectors and on perceptual similarity.
2. **New quality assessment metric**: Proposed a rank-based no-reference quality assessment method. Compared with other image quality assessment methods, it provides a comprehensive assessment consistent with human perception without requiring reference images.
3. **Experimental verification**: Extensive experiments show that the method significantly outperforms other methods in evaluating the visual realism of deepfaked faces. In addition, adding the metric as a loss function when training an existing face-swapping model significantly reduces attribute errors and improves image quality.

### Method overview

- **Dataset construction**: Use high-resolution face images from the CelebAMask-HQ dataset as source and target images, and generate a large number of face-swapping images with five advanced face-swapping methods.
- **Ranking label generation**: Compute loss values for multiple attributes using a 3D face reconstruction model (EMOCA), a pose estimation model (6DRepNet), and a perceptual similarity measure (LPIPS), then derive ranking labels from them.
- **Siamese network training**: Train the Siamese network using a ranking loss function with boundary awareness to learn the quality relationships between image pairs.

The method improves the accuracy of face-swapping image quality assessment and also enhances the robustness and generalization ability of the model, making it better suited to practical applications.
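The ranking-label and pairwise-training steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal attribute weights, and the margin value are hypothetical. The standard pairwise hinge (margin) ranking loss is used here as a stand-in for the "ranking loss function with boundary awareness", whose exact form the summary does not give. The per-attribute errors are plain numbers that, in the described pipeline, would come from models such as EMOCA, 6DRepNet, and LPIPS.

```python
from itertools import combinations


def attribute_error(errors, weights=None):
    """Combine per-attribute errors into one scalar (lower means better).

    `errors` maps a hypothetical attribute name (e.g. "expression", "pose",
    "perceptual") to a non-negative error value. Equal weights are an
    assumption, not something stated in the source.
    """
    if weights is None:
        weights = {name: 1.0 for name in errors}
    return sum(weights[name] * value for name, value in errors.items())


def ranking_pairs(combined_errors):
    """Turn combined errors into (better_id, worse_id) ranking labels.

    Ties are skipped because they give no ordering signal.
    """
    pairs = []
    for (id_a, err_a), (id_b, err_b) in combinations(combined_errors.items(), 2):
        if err_a == err_b:
            continue
        pairs.append((id_a, id_b) if err_a < err_b else (id_b, id_a))
    return pairs


def margin_ranking_loss(score_better, score_worse, margin=0.5):
    """Pairwise hinge loss on two predicted quality scores.

    The loss is zero once the better image is scored at least `margin`
    above the worse one. In a Siamese setup both scores would come from
    the same network with shared weights.
    """
    return max(0.0, margin - (score_better - score_worse))


# Toy usage: two swapped images with made-up attribute errors.
combined = {
    "swap_a": attribute_error({"expression": 0.1, "pose": 0.2, "perceptual": 0.3}),
    "swap_b": attribute_error({"expression": 0.4, "pose": 0.1, "perceptual": 0.5}),
}
labels = ranking_pairs(combined)
```

Because training uses only the ordering of each pair, the network never needs an absolute quality score as ground truth, which is what allows it to score a single swapped image without a reference at inference time.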

Rank-based No-reference Quality Assessment for Face Swapping
