Abstract: Recent methods for 2D facial landmark localization perform well on close-to-frontal faces, but 2D landmarks are insufficient to represent the 3D structure of a facial shape. For applications that require higher accuracy, such as facial motion capture and 3D shape recovery, 3DA-2D landmarks (2D projections of 3D facial annotations) are preferred. Inferring 3D structure from a single image is an ill-posed problem, so accuracy and robustness are not always guaranteed. This paper addresses two problems: accurate 2D facial landmark localization, and the transformation between 2D and 3DA-2D landmarks. One way to increase accuracy is to train on more precisely annotated facial images, but traditional cascaded regressions cannot effectively handle large or noisy training sets. We propose a Mini-Batch Cascaded Regressions (MBCR) method that iteratively trains a robust model from a large data set. Benefiting from an incremental learning strategy and a small learning rate, MBCR is robust to noise in the training data. We also propose a Cross-Dimension Annotations Conversion (CDAC) method that maps facial landmarks from 2D to 3DA-2D coordinates and vice versa. Experimental results show that CDAC combined with MBCR outperforms state-of-the-art methods in 3DA-2D facial landmark localization. Moreover, CDAC runs efficiently at up to 110 fps on a workstation with a 3.4 GHz CPU. CDAC thus provides a way to turn existing 2D alignment methods into 3DA-2D ones without reducing their speed. Training and testing code, as well as the data set, can be downloaded from https://github.com/SWJTU-3DVision/CDAC.
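
To make the idea of mini-batch training with a small learning rate concrete, the following is a minimal sketch of how one linear regression stage of a cascade might be updated incrementally. It is not the authors' MBCR implementation: the function names (`minibatch_stage_update`, `train_stage`), the plain least-squares loss, and all hyperparameter values are assumptions chosen for illustration only.

```python
import numpy as np


def minibatch_stage_update(W, features, residuals, learning_rate=0.01):
    """Apply one gradient step of a least-squares loss on a single mini-batch.

    W:         (d, k) regression matrix of one cascade stage (hypothetical layout)
    features:  (n, d) features extracted around the current landmark estimates
    residuals: (n, k) offsets from the current estimates to the ground truth
    """
    predictions = features @ W
    gradient = features.T @ (predictions - residuals) / len(features)
    return W - learning_rate * gradient


def train_stage(features, residuals, batch_size=32, epochs=50,
                learning_rate=0.01, seed=0):
    """Train one stage by sweeping over shuffled mini-batches (illustrative only)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((features.shape[1], residuals.shape[1]))
    for _ in range(epochs):
        order = rng.permutation(len(features))
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            W = minibatch_stage_update(W, features[batch], residuals[batch],
                                       learning_rate)
    return W
```

Because each step uses only a small batch and moves the weights by a small amount, a few noisy annotations have limited influence on any single update, and the full training set never needs to be held in one matrix solve.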

Multiview Facial Landmark Localization in RGB-D Images via Hierarchical Regression with Binary Patterns

A Cross-Dimension Annotations Method for 3D Structural Facial Landmark Extraction

FDN: Feature Decoupling Network for Head Pose Estimation

Hierarchical Facial Landmark Localization Via Cascaded Random Binary Patterns

Recurrent Volume-Based 3-D Feature Fusion for Real-Time Multiview Object Pose Estimation

Facial landmark localization based on hierarchical pose regression with cascaded random ferns

Long-Term Map-Based Visual Localization: Analysis of Individual Components of a Hierarchical Pipeline

SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

Joint Head Pose and Facial Landmark Regression from Depth Images

Real-time Localization of 3D Facial Landmarks

Regression Forest Based RGB-D Visual Relocalization Using Coarse-to-Fine Strategy

Joint Voxel and Coordinate Regression for Accurate 3D Facial Landmark Localization

Accurate RGB Camera Relocalization Using Regression Forest

Deep-Learning-Based Multiview RGBD Sensor System for 3-D Face Point Cloud Registration

Joint Multi-View Face Alignment in the Wild

Robust 3D face modeling and tracking from RGB-D images

Joint Multiview Segmentation and Localization of RGB-D Images Using Depth-Induced Silhouette Consistency

3-D Facial Landmarks Detection for Intelligent Video Systems

Learning a Deep Regression Forest for Head Pose Estimation from a Single Depth Image

Hierarchical visual localization for visually impaired people using multimodal images