Pedestrian Attribute Recognition by Joint Visual-Semantic Reasoning and Knowledge Distillation

Qiaozhe Li, Xin Zhao, Ran He, Kaiqi Huang
DOI: https://doi.org/10.24963/ijcai.2019/117
2019-01-01
Abstract: Pedestrian attribute recognition in surveillance is a challenging task in computer vision due to significant pose variation, viewpoint change, and poor image quality. To achieve effective recognition, this paper presents a graph-based global reasoning framework that jointly models potential visual-semantic relations of attributes and distills auxiliary human parsing knowledge to guide the relational learning. The reasoning framework models attribute groups on a graph and learns a projection function to adaptively assign local visual features to the nodes of the graph. After feature projection, graph convolution is used to perform global reasoning between the attribute groups and model their mutual dependencies. The learned node features are then projected back to visual space to facilitate knowledge transfer. An additional regularization term is proposed that distills human parsing knowledge from a pre-trained teacher model to enhance feature representations. The proposed framework is evaluated on three large-scale pedestrian attribute datasets: PETA, RAP, and PA-100K. Experiments show that the method achieves state-of-the-art results.
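The project, reason, and re-project steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `graph_reasoning`, the softmax assignment, the row-normalized adjacency, the single graph-convolution layer, and the residual connection are all assumptions chosen for illustration, and the parsing-distillation regularizer is not covered here.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_reasoning(features, w_assign, adjacency, w_node):
    """Hypothetical sketch of one global reasoning step.

    features : (L, C) local visual features at L spatial locations
    w_assign : (C, N) assumed projection weights onto N attribute-group nodes
    adjacency: (N, N) assumed relation matrix between the nodes
    w_node   : (C, C) assumed graph-convolution weights
    """
    # Softly assign every location to the N nodes (each row sums to 1).
    assign = softmax(features @ w_assign, axis=1)        # (L, N)
    # Visual space -> graph space: aggregate local features per node.
    nodes = assign.T @ features                          # (N, C)
    # One graph-convolution step: add self-loops, row-normalize, mix, ReLU.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    nodes = np.maximum(a_hat @ nodes @ w_node, 0.0)      # (N, C)
    # Graph space -> visual space: reuse the assignment, add a residual.
    return features + assign @ nodes                     # (L, C)

rng = np.random.default_rng(0)
feats = rng.standard_normal((48, 16))                    # 48 locations, 16 channels
out = graph_reasoning(
    feats,
    rng.standard_normal((16, 5)),                        # 5 attribute-group nodes
    np.ones((5, 5)),                                     # fully connected graph
    0.1 * rng.standard_normal((16, 16)),
)
print(out.shape)  # (48, 16)
```

Reusing the same soft assignment for both projections keeps the output aligned with the input feature map, so under these assumptions the step can be dropped in as a residual refinement of an existing feature tensor.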