Abstract: Human pose estimation via motion tracking systems can be treated as a regression problem within a discriminative framework. Modeling the mapping from observation space to state space is challenging because the multimodal conditional distribution is high-dimensional. To build the mapping, existing techniques usually rely on a large set of training samples in the learning process, yet they remain limited in their ability to deal with multimodality. In this work, we propose a novel online sparse Gaussian Process (GP) regression model to recover 3-D human motion in monocular videos. In particular, we exploit the observation that, for a given test input, the output is mainly determined by the training samples residing in its local neighborhood, where the neighborhood is defined in the unified input-output space. This leads to a local mixture of GP experts system composed of different local GP experts, each of which dominates a mapping behavior with a specific covariance function adapted to a local region. To handle the multimodality, we combine temporal and spatial information to obtain two categories of local experts. The temporal and spatial experts are integrated into a seamless hybrid system, which is automatically self-initialized and robust for visual tracking of nonlinear human motion. Learning and inference are extremely efficient because all the local experts are defined online within very small neighborhoods. Extensive experiments on two real-world databases, HumanEva and PEAR, demonstrate the effectiveness of the proposed model, which significantly improves on the performance of existing models.
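The idea of a single local GP expert can be illustrated with a minimal sketch. It is not the authors' model: it assumes a squared-exponential covariance with fixed hyperparameters, selects neighbors by Euclidean distance in input space only (the abstract defines neighborhoods in the unified input-output space and combines temporal and spatial experts), and the names `rbf_kernel`, `local_gp_predict`, `k`, and `noise_var` are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return signal_var * np.exp(-0.5 * sq_dists / length_scale**2)


def local_gp_predict(X_train, Y_train, x_test, k=10, noise_var=1e-2):
    """GP prediction for x_test using only its k nearest training inputs.

    Assumption: neighbors are chosen in input space; this is a
    simplification of the neighborhood described in the abstract.
    """
    # 1. Pick the local neighborhood of the test input.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    idx = np.argsort(dists)[:k]
    X_loc, Y_loc = X_train[idx], Y_train[idx]

    # 2. Build the small local covariance matrix (k x k).
    K = rbf_kernel(X_loc, X_loc) + noise_var * np.eye(len(idx))
    k_star = rbf_kernel(x_test[None, :], X_loc)  # shape (1, k)

    # 3. Standard GP posterior via a Cholesky factorization.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y_loc))
    mean = (k_star @ alpha)[0]  # shape (output_dim,)

    v = np.linalg.solve(L, k_star.T)
    prior_var = rbf_kernel(x_test[None, :], x_test[None, :])[0, 0]
    var = max(prior_var - float(np.sum(v**2)), 0.0)
    return mean, var, idx
```

Because the covariance matrix is only `k x k`, the cubic cost of GP inference depends on the neighborhood size rather than on the full training set, which is the efficiency argument the abstract makes for experts defined within very small neighborhoods.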

Multihuman Tracking Based on a Spatial–Temporal Appearance Match

Track Without Appearance: Learn Box and Tracklet Embedding with Local and Global Motion Patterns for Vehicle Tracking

Robust Visual Tracking Via CAMShift and Structural Local Sparse Appearance Model

Multi-object tracking via discriminative appearance modeling

Appearance Guidance Attention for Multi-Object Tracking

Visual Tracking by Appearance Modeling and Sparse Representation

Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking

Special Issue on Visual Tracking

Multi-Channel Adaptive Mixture Background Model for Real-time Tracking

Object Tracking Via Appearance Modeling and Sparse Representation

A Cost Function Approach for Multi-Human Tracking

Object Tracking with Hierarchical Multiview Learning

Exploiting Pair-Wise Constraints Between Parts for Human Tracking

Multi-Person Articulated Tracking With Spatial and Temporal Embeddings

CAMTrack: a combined appearance-motion method for multiple-object tracking

Robust Visual Tracking Based on Hierarchical Appearance Model

Tracklet Association by Online Target-Specific Metric Learning and Coherent Dynamics Estimation

Tracking People by Predicting 3D Appearance, Location & Pose

Joint Feature-Spatial-Measure Space: A New Approach to Highly Efficient Probabilistic Object Tracking

Human motion tracking by temporal-spatial local gaussian process experts

Object Tracking with Multi-View Support Vector Machines