3D Head Pose Estimation via Normal Maps: A Generalized Solution for Depth Image, Point Cloud, and Mesh

Jiang Wu, Hua Chen
DOI: https://doi.org/10.1002/aisy.202400159
IF: 7.298
2024-10-15
Advanced Intelligent Systems
Abstract: Head pose estimation plays a crucial role in applications such as human–machine interaction, autonomous driving systems, and 3D reconstruction. Current methods address the problem primarily from a 2D perspective, which limits how efficiently 3D information can be used. Herein, the pose orientation‐aware network (POANet) is introduced: a lightweight model for 3D head pose estimation that uses normal maps to embed orientation information, providing abundant and robust head pose cues. POANet incorporates an axial signal perception module and a rotation matrix perception module; these lightweight modules allow the approach to achieve state‐of‐the‐art (SOTA) performance at low computational cost. The method can directly analyze 3D data of various topologies (depth images, point clouds, and meshes) without extensive preprocessing. For depth images, POANet outperforms existing methods on the Biwi Kinect head pose dataset, reducing the mean absolute error (MAE) by ≈30% compared to the SOTA methods. For mesh data, POANet is the first method to perform rigid head registration in a landmark‐free manner. It also incorporates few‐shot learning capabilities and achieves an MAE of about 1° on the Headspace dataset. These features make POANet a superior alternative to traditional generalized Procrustes analysis for mesh data processing, offering enhanced convenience for human phenotype studies.
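To illustrate what a normal map is, the following minimal sketch derives per-pixel unit surface normals from a depth image using finite differences. It is not the authors' pipeline: the function name `depth_to_normal_map` is hypothetical, and the sketch assumes an orthographic view with unit pixel spacing, so camera intrinsics are ignored.

```python
import numpy as np


def depth_to_normal_map(depth):
    """Approximate unit surface normals for a depth image of shape (H, W).

    Hypothetical helper for illustration only. Assumes an orthographic
    projection with unit pixel spacing (no camera intrinsics).
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an unnormalized normal is (-df/dx, -df/dy, 1).
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    # The z component is 1, so the length is at least 1 and the division is safe.
    lengths = np.linalg.norm(normals, axis=2, keepdims=True)
    return normals / lengths


# A flat surface facing the camera has normals pointing along +z everywhere.
flat_normals = depth_to_normal_map(np.full((4, 4), 2.0))

# A surface whose depth grows linearly with x has normals tilted toward -x.
ramp_normals = depth_to_normal_map(np.tile(np.arange(5.0), (3, 1)))
```

Because each pixel of the resulting (H, W, 3) array stores a surface direction, rotating the underlying surface rotates these vectors with it. This is one plausible reading of why normal maps carry orientation information.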
What problem does this paper attempt to address?