Human Activity Learning and Segmentation using Partially Hidden Discriminative Models

Truyen Tran, Hung Bui, Svetha Venkatesh
DOI: https://doi.org/10.48550/arXiv.1408.3081
2014-08-06
Abstract: Learning and understanding the typical patterns in the daily activities and routines of people from low-level sensory data is an important problem in many application domains, such as building smart environments or providing intelligent assistance. Traditional approaches to this problem typically rely on supervised learning and generative models such as hidden Markov models and their extensions. While activity data can be readily acquired from pervasive sensors, e.g. in smart environments, providing manual labels to support supervised training is often extremely expensive. In this paper, we propose a new approach based on semi-supervised training of partially hidden discriminative models such as the conditional random field (CRF) and the maximum entropy Markov model (MEMM). We show that these models allow us to incorporate both labeled and unlabeled data for learning, and at the same time, provide us with the flexibility and accuracy of the discriminative framework. Our experimental results in the video surveillance domain illustrate that these models can perform better than their generative counterpart, the partially hidden Markov model, even when a substantial number of labels are unavailable.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the problem of learning and understanding typical patterns in people's daily activities and routines from low-level sensor data. This matters in many application scenarios, such as building smart environments or providing intelligent assistance. Traditional methods usually rely on supervised learning and generative models (such as hidden Markov models and their extensions). However, although activity data can be obtained easily from pervasive sensors, providing manual labels to support supervised training is often very expensive. The paper therefore proposes semi-supervised training of partially hidden discriminative models, such as conditional random fields (CRF) and maximum entropy Markov models (MEMM). These models can learn from both labeled and unlabeled data while retaining the flexibility and accuracy of the discriminative framework.

Expressed in formulas, training solves the following optimization problem:

- **Objective function**: maximize the log-likelihood with a penalty term
  \[ \Lambda(\lambda) = L(\lambda)-\frac{1}{2\sigma^{2}}\|\lambda\|^{2} \]
  where \( L(\lambda)=\log p(v|x; \lambda) \), \( v \) denotes the visible labels, \( x \) the observed data, \( \lambda \) the model parameters, and \( \sigma \) the regularization parameter.
- **Expectation-Maximization (EM) algorithm**: with \( h \) denoting the hidden (unobserved) labels and \( \lambda_{j} \) the parameter estimate at iteration \( j \):
  - **E-step**: compute
    \[ Q(\lambda_{j}, \lambda)=\sum_{h} p(h|v, x; \lambda_{j})\log p(h, v|x; \lambda) \]
  - **M-step**: maximize, with respect to \( \lambda \),
    \[ Q(\lambda_{j}, \lambda)-\frac{1}{2\sigma^{2}}\|\lambda\|^{2} \]

Using this method, the paper shows that in the video surveillance domain these discriminative models outperform their generative counterpart (the partially hidden Markov model), even when a large number of labels are missing.
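As a rough illustration of the penalized objective above, the following Python sketch evaluates \( \Lambda(\lambda) \) for a toy linear-chain model by brute-force enumeration of label sequences. It is not the paper's implementation: the feature layout (`emit`, `trans`), the function names, and the use of `None` to mark a hidden label are assumptions made here for illustration. Enumeration is only feasible for very short sequences; practical training would typically rely on dynamic programming instead.

```python
import itertools
import math


def score(labels, x, emit, trans):
    """Unnormalized log-score of one complete label sequence (hypothetical features)."""
    s = 0.0
    for t, y in enumerate(labels):
        # State/observation term: weight vector of label y dotted with x[t].
        s += sum(w * f for w, f in zip(emit[y], x[t]))
        if t > 0:
            # Transition term between consecutive labels.
            s += trans[labels[t - 1]][y]
    return s


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(val - m) for val in values))


def penalized_objective(x, v, emit, trans, sigma):
    """log p(v | x; lambda) - ||lambda||^2 / (2 sigma^2), by enumeration.

    v holds one entry per time step: a label index if visible, None if hidden.
    """
    n_labels = len(emit)
    all_seqs = list(itertools.product(range(n_labels), repeat=len(x)))
    # Summing over sequences consistent with v marginalizes out the hidden labels h.
    consistent = [
        y for y in all_seqs
        if all(vt is None or vt == yt for vt, yt in zip(v, y))
    ]
    log_numer = logsumexp([score(y, x, emit, trans) for y in consistent])
    log_z = logsumexp([score(y, x, emit, trans) for y in all_seqs])
    sq_norm = sum(w * w for row in emit for w in row)
    sq_norm += sum(w * w for row in trans for w in row)
    return (log_numer - log_z) - sq_norm / (2.0 * sigma ** 2)


# Toy data: 3 time steps, 2 labels, 2 observation features (all values made up).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [0, None, 1]                      # the middle label is hidden
emit = [[0.5, -0.2], [-0.3, 0.8]]     # emit[label][feature]
trans = [[0.2, -0.1], [0.0, 0.3]]     # trans[previous_label][label]
print(penalized_objective(x, v, emit, trans, sigma=1.0))
```

With all weights set to zero the model is uniform, so the objective reduces to the log of the fraction of label sequences consistent with the visible labels, which makes a convenient sanity check.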
In summary, this paper mainly addresses how to efficiently learn and segment human activities from low-level sensor data using partially hidden discriminative models when labeled data is limited.