HeadNet: an End-to-End Adaptive Relational Network for Head Detection

Wei Li, Hongliang Li, Qingbo Wu, Fanman Meng, Linfeng Xu, King Ngi Ngan
DOI: https://doi.org/10.1109/tcsvt.2019.2890840
IF: 5.859
2020-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Head detection plays an important role in localizing and identifying persons from visual data. Most existing methods treat head detection as a specific form of object detection. Head detection is nontrivial due to the considerable difficulty in building the local and global information under conditions of unconstrained pose and orientation. To address these issues, this paper presents an effective adaptive relational network to capture context information, which is greatly helpful to suppress missed detection. We show that the fundamental contextual properties, such as the global shape priors from different heads and the local adjacent relationship between the head and shoulders, can be systematically quantified by visual operators. Specifically, we propose a two-step search algorithm to quantify the global intergroup conflict with adaptive scale, pose and viewpoint. Meanwhile, a structured feature module is introduced to capture the local relation of intraindividual stability. Finally, the global priors and local relation are integrated seamlessly into a single-stage head detector that is end-to-end trainable. An extensive ablation analysis demonstrates the effectiveness of our approach. We achieve state-of-the-art results on two challenging datasets, i.e., HollywoodHeads and Brainwash.