Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation

Zong-Wei Hong, Yu-Chen Lin

2024-04-09

Abstract: Computer vision has seen significant advances in facial-landmark detection, which has become increasingly essential in applications such as augmented reality, facial recognition, and emotion analysis. Unlike object detection or semantic segmentation, which focus on identifying objects and outlining boundaries, facial-landmark detection aims to precisely locate and track critical facial features. However, deploying deep learning-based facial-landmark detection models on embedded systems with limited computational resources poses challenges due to the complexity of facial features, especially in dynamic settings. Additionally, ensuring robustness across diverse ethnicities and expressions presents further obstacles. Existing datasets often lack comprehensive representation of facial nuances, particularly within populations like those in Taiwan. This paper introduces a novel approach to address these challenges through the development of a knowledge distillation method. By transferring knowledge from larger models to smaller ones, we aim to create lightweight yet powerful deep learning models tailored specifically for facial-landmark detection tasks. Our goal is to design models capable of accurately locating facial landmarks under varying conditions, including diverse expressions, orientations, and lighting environments. The ultimate objective is to achieve high accuracy and real-time performance suitable for deployment on embedded systems. This method was successfully implemented and achieved 6th place out of 165 participants in the IEEE ICME 2024 PAIR competition.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The main goal of this paper is to achieve efficient and accurate facial landmark detection on resource-constrained embedded systems. Specifically, the paper aims to address the following issues:

1. **Deploying complex deep learning models on embedded systems with limited computational resources**: Facial landmark detection requires high precision and real-time performance, but embedded systems have limited computational power, which makes it difficult to deploy large deep learning models directly.
2. **Robustness across diverse ethnicities and expressions**: Existing datasets often fail to cover the full range of ethnicities and expressions, so models trained on them generalize poorly in practical applications.

To address these issues, the authors propose a knowledge distillation-based approach. By transferring knowledge from a large model (such as Swin Transformer) to a smaller one (such as MobileViT-v2), they obtain a lightweight yet accurate deep learning model that is intended to run in real time on embedded devices. According to the paper, the resulting MobileViT-v2 model performs well on the validation set and achieved 6th place out of 165 participants in the IEEE ICME 2024 PAIR competition.
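The teacher-to-student transfer described above can be illustrated with a minimal sketch of an output-level distillation loss for landmark regression. This is not the authors' implementation: the summary does not state which loss they use, so the function names, the squared-distance terms, and the weighting factor `alpha` are all assumptions chosen for illustration. The student is penalized both for deviating from the ground-truth landmarks and for deviating from the teacher's predictions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # one (x, y) landmark coordinate


def mean_squared_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Mean squared Euclidean distance between two equally long landmark sets."""
    if len(a) != len(b) or not a:
        raise ValueError("landmark sets must be non-empty and the same length")
    total = sum((ax - bx) ** 2 + (ay - by) ** 2
                for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def distillation_loss(student: Sequence[Point],
                      teacher: Sequence[Point],
                      target: Sequence[Point],
                      alpha: float = 0.5) -> float:
    """Hypothetical combined loss: alpha weights the ground-truth term,
    (1 - alpha) weights the term that imitates the teacher's predictions."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    supervised = mean_squared_distance(student, target)
    imitation = mean_squared_distance(student, teacher)
    return alpha * supervised + (1.0 - alpha) * imitation


# Toy example with two landmarks (all values are made up).
student_pred = [(0.0, 0.0), (1.0, 1.0)]
teacher_pred = [(0.0, 1.0), (1.0, 1.0)]
ground_truth = [(2.0, 0.0), (1.0, 1.0)]
print(distillation_loss(student_pred, teacher_pred, ground_truth, alpha=0.5))
```

In a real training pipeline the same idea would typically be written with a deep learning framework so that gradients flow into the student only, with the teacher's weights kept frozen. Setting `alpha` to 1 reduces the loss to ordinary supervised training, while setting it to 0 trains the student purely to imitate the teacher.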

Research on Knowledge Distillation Algorithm of Object Detection

Robust and efficient facial landmark localization

Efficient Facial Landmark Detection for Embedded Systems

Empowering Object Detection: Unleashing the Potential of Decoupled and Interactive Distillation

Facial Landmark Detection Via Attention-Adaptive Deep Network

Multi-level knowledge distillation for low-resolution object detection and facial expression recognition

Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors

PFLD: A Practical Facial Landmark Detector

Training Deep Face Recognition for Efficient Inference by Distillation and Mutual Learning

Structured Knowledge Distillation for Accurate and Efficient Object Detection

Two-in-one Knowledge Distillation for Efficient Facial Forgery Detection

Anatomical Landmark Detection Using a Feature-Sharing Knowledge Distillation-Based Neural Network

Facial Landmark Points Detection Using Knowledge Distillation-Based Neural Networks

Low-Resolution Face Recognition in the Wild Via Selective Knowledge Distillation

Distilling Object Detectors with Global Knowledge

Shared Knowledge Distillation Network for Object Detection

Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition

A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication

Exploring Effective Knowledge Distillation for Tiny Object Detection

Feature-Based Knowledge Distillation for Infrared Small Target Detection