Human Parsing Using Stochastic And-or Grammars and Rich Appearances

Brandon Rothrock, Song-Chun Zhu
DOI: https://doi.org/10.1109/iccvw.2011.6130303
2011-01-01
Abstract: One of the key challenges in human parsing and pose recovery is handling the variability in geometry and appearance of humans in natural scenes. This variability is due to the large number of distinct articulated configurations, clothing, and self-occlusion, as well as unknown lighting and viewpoint. In this paper, we present a stochastic grammar model that represents the body as an articulated assembly of compositional and reconfigurable parts. The reconfigurable aspect allows a compatible part to be substituted with an alternative part with different attributes, such as clothing appearance or viewpoint foreshortening. Relations within the grammar enforce consistency between part attributes as well as geometry, allowing a richer set of appearance and geometry constraints than conventional articulated models. Part appearances are modeled by a sparse deformable image template that can still richly describe salient part structures. We describe a dynamic programming parsing algorithm for our model, and show pose recovery results competitive with the state of the art on a challenging dataset.
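The abstract does not spell out the parsing algorithm, so the following is only a minimal sketch of the general idea of dynamic programming over an And-Or structure, not the authors' method. The grammar, part names, and scores are all hypothetical. An "and" node composes its children by summing their scores, an "or" node selects its best-scoring alternative, and terminals carry a fixed appearance score. The sketch deliberately omits the geometric and attribute-consistency relations the abstract describes, as well as any image evidence.

```python
from functools import lru_cache

# Hypothetical toy grammar; names and scores are illustrative assumptions,
# not taken from the paper.
GRAMMAR = {
    "body": ("and", ("torso", "arm")),               # composition of parts
    "torso": ("or", ("torso_front", "torso_side")),  # viewpoint alternatives
    "arm": ("or", ("arm_bare", "arm_sleeved")),      # clothing alternatives
}

# Stand-in appearance scores for terminal parts (higher is better).
TERMINAL_SCORE = {
    "torso_front": 2.0,
    "torso_side": 1.5,
    "arm_bare": 0.5,
    "arm_sleeved": 1.0,
}


@lru_cache(maxsize=None)
def best_parse(symbol):
    """Return (score, chosen terminals) for the best parse rooted at symbol."""
    if symbol in TERMINAL_SCORE:
        return TERMINAL_SCORE[symbol], (symbol,)
    kind, children = GRAMMAR[symbol]
    results = [best_parse(child) for child in children]
    if kind == "or":
        # Or node: keep the single best alternative.
        return max(results, key=lambda r: r[0])
    # And node: combine all children, summing their scores.
    total = sum(score for score, _ in results)
    leaves = tuple(leaf for _, parts in results for leaf in parts)
    return total, leaves


if __name__ == "__main__":
    print(best_parse("body"))
```

Memoizing `best_parse` means each symbol is solved once and reused wherever it is shared. In a model with relations between parts, the choice at one "or" node would also depend on the choices at related nodes, which this simplified version does not capture.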