Multiview Facial Landmark Localization in RGB-D Images via Hierarchical Regression with Binary Patterns

Zhanpeng Zhang, Wei Zhang, Jianzhuang Liu, Xiaoou Tang
DOI: https://doi.org/10.1109/tcsvt.2014.2308639
IF: 5.859
2014-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: In this paper, we propose a real-time system for multiview facial landmark localization in RGB-D images. The facial landmark localization problem is formulated as a regression problem that estimates both the head pose and the landmark positions. Within this framework, we propose a coarse-to-fine approach to handle the high-dimensional regression output. First, the 3-D face position and rotation are estimated from the depth observation via a random regression forest. Next, the 3-D pose is refined by fusing in the estimate from the RGB observation. Finally, the landmarks are located from the RGB observation with gradient boosted decision trees in a pose-conditional model. The proposed framework has two benefits. First, pose estimation and landmark localization are solved with hierarchical regression, unlike previous approaches that iteratively optimize the pose and landmark locations and therefore rely heavily on the initial pose estimate. Second, because the RGB and depth cues have different characteristics, they are used at different stages of landmark localization and combined in a robust manner. In the experiments, we show that the proposed approach outperforms state-of-the-art algorithms on facial landmark localization with RGB-D input.
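The three stages named in the abstract (coarse pose from depth, pose refinement with RGB, pose-conditional landmark regression) can be illustrated with a minimal control-flow sketch. This is not the authors' implementation. The regressors are passed in as plain callables standing in for the random regression forest and the gradient boosted decision trees. The pose is reduced to three rotation angles, the fusion is a simple weighted average, and the view bins and their thresholds are hypothetical choices made only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: pose is (yaw, pitch, roll) in degrees. The paper also estimates
# 3-D face position, which this sketch leaves out for brevity.
Pose = Tuple[float, float, float]
Landmarks = List[Tuple[float, float]]


def fuse_pose(depth_pose: Pose, rgb_pose: Pose, depth_weight: float = 0.7) -> Pose:
    """Weighted average of two pose estimates.

    A stand-in for the paper's fusion step; the weight is hypothetical.
    """
    w = depth_weight
    return tuple(w * d + (1.0 - w) * r for d, r in zip(depth_pose, rgb_pose))


def pose_bin(pose: Pose, yaw_edges: Tuple[float, float] = (-30.0, 30.0)) -> str:
    """Map yaw to a discrete view label (hypothetical bins and thresholds)."""
    yaw = pose[0]
    if yaw < yaw_edges[0]:
        return "left"
    if yaw > yaw_edges[1]:
        return "right"
    return "frontal"


def localize(
    depth,
    rgb,
    depth_pose_regressor: Callable[[object], Pose],
    rgb_pose_regressor: Callable[[object], Pose],
    landmark_regressors: Dict[str, Callable[[object, Pose], Landmarks]],
) -> Tuple[Pose, Landmarks]:
    """Coarse-to-fine localization: pose first, then landmarks given the pose."""
    # Stage 1: coarse pose from the depth observation.
    coarse = depth_pose_regressor(depth)
    # Stage 2: refine the pose with the estimate from the RGB observation.
    refined = fuse_pose(coarse, rgb_pose_regressor(rgb))
    # Stage 3: pick the landmark model conditioned on the refined pose.
    view = pose_bin(refined)
    return refined, landmark_regressors[view](rgb, refined)
```

Because the landmark model is selected only after the pose is fixed, each stage regresses a lower-dimensional output than a single joint regressor would, which is the motivation the abstract gives for the hierarchy.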