A Generalized Earley Parser for Human Activity Parsing and Prediction.

Siyuan Qi, Baoxiong Jia, Siyuan Huang, Ping Wei, Song-Chun Zhu
DOI: https://doi.org/10.1109/tpami.2020.2976971
IF: 23.6
2020-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: Detection, parsing, and future predictions on sequence data (e.g., videos) require the algorithms to capture non-Markovian and compositional properties of high-level semantics. Context-free grammars are natural choices to capture such properties, but traditional grammar parsers (e.g., Earley parser) only take symbolic sentences as inputs. In this paper, we generalize the Earley parser to parse sequence data which is neither segmented nor labeled. Given the output of an arbitrary probabilistic classifier, this generalized Earley parser finds the optimal segmentation and labels in the language defined by the input grammar. Based on the parsing results, it makes top-down future predictions. The proposed method is generic, principled, and widely applicable. Experiment results clearly show the benefit of our method for both human activity parsing and prediction on three video datasets.
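The idea of finding the optimal segmentation and labels within a language, given frame-wise classifier output, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes the language is a small finite set of label sentences that can be enumerated by brute force, whereas the paper parses against a context-free grammar. The names `best_alignment`, `parse`, `frame_probs`, and `language` are hypothetical, and the probabilities are made-up toy values.

```python
import math

def best_alignment(log_probs, sentence):
    """Best in-order assignment of frames to the labels in `sentence`.

    log_probs[t][k] is the classifier's log-probability of label k at frame t.
    Every label in `sentence` must cover at least one consecutive frame.
    Returns (log score, per-frame labels), or (-inf, []) if nothing fits.
    """
    T, n = len(log_probs), len(sentence)
    if n == 0 or T < n:
        return -math.inf, []
    # score[t][i]: best log score of frames 0..t with frame t in segment i.
    score = [[-math.inf] * n for _ in range(T)]
    advanced = [[False] * n for _ in range(T)]
    score[0][0] = log_probs[0][sentence[0]]
    for t in range(1, T):
        for i in range(n):
            stay = score[t - 1][i]
            advance = score[t - 1][i - 1] if i > 0 else -math.inf
            best = max(stay, advance)
            if best == -math.inf:
                continue  # segment i is unreachable at frame t
            advanced[t][i] = advance > stay
            score[t][i] = best + log_probs[t][sentence[i]]
    if score[T - 1][n - 1] == -math.inf:
        return -math.inf, []
    # Backtrack from the last frame to recover the per-frame labels.
    labels, i = [], n - 1
    for t in range(T - 1, -1, -1):
        labels.append(sentence[i])
        if advanced[t][i]:
            i -= 1
    labels.reverse()
    return score[T - 1][n - 1], labels

def parse(log_probs, language):
    """Pick the sentence in a finite `language` with the best alignment."""
    best = (-math.inf, None, [])
    for sentence in language:
        score, labels = best_alignment(log_probs, sentence)
        if score > best[0]:
            best = (score, sentence, labels)
    return best

# Toy classifier output: 4 frames, 3 labels (assumed values).
frame_probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.7, 0.2],
]
log_probs = [[math.log(p) for p in row] for row in frame_probs]
language = [(0, 1), (1, 0)]  # hypothetical finite language of label sentences

score, sentence, labels = parse(log_probs, language)
print(sentence, labels, math.exp(score))
```

In this toy input the frame-wise argmax is 0, 0, 2, 1, which is not a sentence of the language. Restricting the search to the language yields the labels 0, 0, 1, 1 instead, which is the kind of correction that grammar-constrained parsing provides.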