Relative Pose Consistency for Semi-Supervised Head Pose Estimation

Felix Kuhnke, Sontje Ihler, Jörn Ostermann
DOI: https://doi.org/10.1109/fg52635.2021.9666992
2021-12-15
Abstract: Human head pose estimation from images plays a vital role in applications such as driver assistance systems and human behavior analysis. Head pose estimation networks are typically trained in a supervised manner. Unfortunately, manual and sensor-based annotations of head poses are prone to errors. A solution is supervised training on synthetic training data generated from 3D face models, which can provide an unlimited number of perfect labels. However, computer-generated face images only approximate real-world images, which results in a domain gap between the training and application domains. To date, domain adaptation is rarely addressed in work on head pose estimation. In this work, we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. It allows simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap while keeping the advantages of synthetic data. Consistency regularization enforces consistent network predictions under random image augmentations. We address both pose-preserving and pose-altering augmentations. Naturally, pose-altering augmentations cannot be used directly on unlabeled data, because the resulting absolute pose is unknown without a label. We therefore propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs. This allows the network to benefit from relative pose labels during training on the unlabeled, real-world images. We evaluate our approach on a widely used benchmark (Biwi Kinect Head Pose) and outperform state-of-the-art domain adaptation methods. We are the first to present a consistency regularization framework for head pose estimation. Our experiments show that our approach improves head pose estimation accuracy for real-world images despite using only labels from synthetic images.
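The relative pose idea can be illustrated with a minimal sketch. The abstract does not specify the augmentations or the loss, so everything here is an assumption made for illustration: the pose-altering augmentation is taken to be an in-plane image rotation, modeled as a rotation about the camera's optical (z) axis; head poses are 3x3 rotation matrices; and the names `rot_z`, `geodesic_angle`, and `relative_pose_consistency_loss` are hypothetical. If the unknown pose of an unlabeled image is R, the two augmented views have poses A1 R and A2 R, so their relative rotation is A2 A1^T, which does not depend on R and can serve as a label.

```python
import numpy as np


def rot_z(angle_rad):
    """Rotation about the camera z axis (assumed model of an in-plane image rotation)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def geodesic_angle(r_a, r_b):
    """Angle in radians of the rotation that takes r_a to r_b."""
    cos_angle = (np.trace(r_a.T @ r_b) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def relative_pose_consistency_loss(r_pred_1, r_pred_2, angle_1, angle_2):
    """Hypothetical loss for one unlabeled image seen under two augmentations.

    r_pred_1, r_pred_2: predicted head rotations for the two augmented views.
    angle_1, angle_2:   known in-plane rotation angles applied to each view.
    """
    # Relative rotation implied by the two network predictions.
    rel_pred = r_pred_2 @ r_pred_1.T
    # Relative rotation known from the augmentations; the unknown pose cancels.
    rel_target = rot_z(angle_2) @ rot_z(angle_1).T
    return geodesic_angle(rel_pred, rel_target)
```

Under these assumptions, the loss is close to zero when both predictions rotate consistently with the applied augmentations, and it grows with the angular mismatch otherwise. A training setup would combine such a term on unlabeled real images with a supervised loss on labeled synthetic images; the weighting between the two is not given in the abstract.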