DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer

Wei-Ting Chen, Gurunandan Krishnan, Qiang Gao, Sy-Yen Kuo, Sizhuo Ma, Jian Wang
2024-06-14
Abstract: Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images, which is crucial in improving image restoration algorithms and selecting high-quality face images for downstream tasks. We present a novel transformer-based method for GFIQA, which is aided by two unique mechanisms. First, a Dual-Set Degradation Representation Learning (DSL) mechanism uses facial images with both synthetic and real degradations to decouple degradation from content, ensuring generalizability to real-world scenarios. This self-supervised method learns degradation features on a global scale, providing a robust alternative to conventional methods that use local patch information in degradation learning. Second, our transformer leverages facial landmarks to emphasize visually salient parts of a face image in evaluating its perceptual quality. We also introduce a balanced and diverse Comprehensive Generic Face IQA (CGFIQA-40k) dataset of 40K images carefully designed to overcome the biases, in particular the imbalances in skin tone and gender representation, in existing datasets. Extensive analysis and evaluation demonstrate the robustness of our method, marking a significant improvement over prior methods.
Computer Vision and Pattern Recognition, Artificial Intelligence, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses Generic Face Image Quality Assessment (GFIQA): how to evaluate the perceptual quality of facial images effectively. The task matters in several applications, such as improving image restoration algorithms and selecting high-quality face images for downstream tasks. The paper focuses on the following challenges:

1. **Complex Facial Features**: The human face has complex visual features and expressions, and these strongly influence perceived quality.
2. **Difficulty of Subjective Scoring**: Subjective scores such as the Mean Opinion Score (MOS) are hard to obtain, because the number of available facial images is limited and subjective evaluation is itself ambiguous.
3. **Facial Occlusion**: Occlusions caused by masks, accessories, and similar items make the evaluation more complex.

Existing methods have limitations when applied to facial images:

- **General Image Quality Assessment (GIQA)**: Performs well on general images but ignores the subtle features unique to the human face.
- **Biometric Facial Quality Assessment (BFIQA)**: Focuses on ensuring that facial images are good enough for robust biometric identification, but may not accurately evaluate perceptual degradation.

To address these problems, the paper proposes DSL-FIQA, a new transformer-based method built on two mechanisms:

1. **Dual-Set Degradation Representation Learning (DSL)**: Uses facial images with both synthetic and real degradations to decouple degradation from content, so that the model generalizes to real-world scenarios.
2. **Landmark-Guided Transformer**: Uses facial landmarks to emphasize the visually salient parts of a face image, so that perceptual quality is evaluated more accurately.
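
The landmark-guided idea can be illustrated with a small sketch. This is not the paper's transformer: it only shows one simple way that landmark positions could be turned into per-patch emphasis weights. The function names `landmark_patch_weights` and `weighted_quality`, the Gaussian falloff, and the `grid` and `sigma` parameters are all assumptions made for illustration.

```python
import math


def landmark_patch_weights(landmarks, image_size, grid, sigma):
    """Return one weight per patch (row-major order), summing to 1.

    Hypothetical scheme: a patch's weight decays as a Gaussian of the
    distance from the patch centre to its nearest landmark.
    """
    if not landmarks:
        raise ValueError("at least one landmark is required")
    patch = image_size / grid  # side length of one square patch, in pixels
    raw = []
    for row in range(grid):
        for col in range(grid):
            cx = (col + 0.5) * patch
            cy = (row + 0.5) * patch
            # Squared distance to the closest landmark.
            d2 = min((cx - x) ** 2 + (cy - y) ** 2 for x, y in landmarks)
            raw.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(raw)
    return [w / total for w in raw]


def weighted_quality(patch_scores, weights):
    """Pool per-patch quality scores using the landmark-derived weights."""
    return sum(s * w for s, w in zip(patch_scores, weights))


# Usage: one made-up landmark in a 64x64 image split into a 4x4 patch grid.
weights = landmark_patch_weights([(24.0, 24.0)], image_size=64, grid=4, sigma=16.0)
score = weighted_quality([0.5] * 16, weights)
```

Patches near a landmark receive more weight, so their scores dominate the pooled result. In the paper the emphasis happens inside a transformer, not through fixed weighted pooling as in this sketch.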
In addition, the paper introduces a new dataset, CGFIQA-40k, which contains 40,000 images and is designed to overcome the imbalances in skin tone and gender representation found in existing datasets. In summary, the paper aims to develop a more robust method for facial image quality assessment that addresses the challenges above and improves evaluation performance.