Abstract: Precise keypoint detection is a long-standing goal in computer vision, with applications such as face recognition, facial region tracking, and facial expression analysis. While extensive research has addressed the visual spectrum, infrared imaging, which is rich in physiological cues, remains largely unexplored because annotated datasets are scarce. The thermal domain offers a way to extract indicators of both mental and physical states in humans. A key component of autonomous mental state recognition is a robust face tracker that localizes facial landmarks precisely. To this end, we present an algorithm for accurate facial keypoint detection in thermal images. We use a dataset manually annotated with 68 facial keypoints across 94 subjects. To overcome the limited number of samples, we adopt self-supervised learning. Our Convolutional Neural Network (CNN) is pretrained on three pretext tasks: image rotation prediction, subject classification, and 5-point facial keypoint detection. These tasks reached 100% accuracy, 97.92% accuracy, and a mean absolute percentage error of 2.37%, respectively. The knowledge acquired through these tasks improved the network's representation of thermal facial data and of the features that distinguish each face. We then applied the fine-tuned CNN to keypoint detection, removing its fully connected layers. Building on the self-supervised pretraining, this change produced a clear improvement in keypoint detection precision: the Normalized Mean Error (NME) was 1.05, compared with 3.19 for conventional fully supervised learning. This result shows that self-supervised learning can improve keypoint detection accuracy when annotated thermal data are limited.
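For readers unfamiliar with the metric, the following is a minimal sketch of how a Normalized Mean Error is commonly computed for 68-point facial landmarks. The abstract does not state the normalization factor or the scale of the reported values, so two choices here are assumptions: normalization by the distance between the outer eye corners (indices 36 and 45 in the common 68-point layout), and reporting the result as a percentage. The function name `normalized_mean_error` is hypothetical and is not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def normalized_mean_error(
    pred: Sequence[Point],
    truth: Sequence[Point],
    left_eye_idx: int = 36,   # assumption: outer corner of one eye (68-point layout)
    right_eye_idx: int = 45,  # assumption: outer corner of the other eye
) -> float:
    """Mean point-to-point error divided by an inter-ocular distance, in percent.

    The normalization factor and the percentage scaling are assumptions;
    the paper may use a different convention (e.g. bounding-box size).
    """
    if len(pred) != len(truth) or len(truth) == 0:
        raise ValueError("pred and truth must be non-empty and of equal length")

    # Normalization distance is taken from the ground-truth landmarks.
    norm = math.dist(truth[left_eye_idx], truth[right_eye_idx])
    if norm == 0:
        raise ValueError("normalization distance is zero")

    # Average Euclidean distance between predicted and true keypoints.
    mean_err = sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)
    return 100.0 * mean_err / norm


# Toy example: 68 landmarks on a line, every prediction offset vertically by 0.9.
truth_pts = [(float(i), 0.0) for i in range(68)]
pred_pts = [(x, y + 0.9) for x, y in truth_pts]
nme = normalized_mean_error(pred_pts, truth_pts)
```

In this toy case the normalization distance is 9 and every point is off by 0.9, so the score is about 10 under the assumed percentage convention. A perfect prediction gives 0.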

Leveraging Self-Supervised Learning for Accurate Facial Keypoint Detection in Thermal Images
