Joint Human Detection and Head Pose Estimation Via Multistream Networks for RGB-D Videos

Guyue Zhang, Jun Liu, Hengduo Li, Yan Qiu Chen, Larry S. Davis
DOI: https://doi.org/10.1109/lsp.2017.2731952
2017-01-01
IEEE Signal Processing Letters
Abstract: We propose a multistream multitask deep network for joint human detection and head pose estimation in RGB-D videos. To achieve high accuracy, we jointly utilize appearance, shape, and motion information as inputs. Based on the depth information, we generate scale-invariant proposals, which are then fed into a novel contextual region of interest pooling (CRP) layer in our deep network. This CRP layer has two branches to deal with contextual information for each subject. The proposed method outperforms state-of-the-art approaches on three public datasets.