Characterizing Activity Sequences Using Profile Hidden Markov Models.
Feng Liu, Davy Janssens, JianXun Cui, Geert Wets, Mario Cools
DOI: https://doi.org/10.1016/j.eswa.2015.02.057
IF: 8.5
2015-01-01
Expert Systems with Applications
Abstract: In the literature, activity sequences generated from activity-travel diaries have been analyzed and classified into clusters based on the composition and ordering of the activities using Sequence Alignment Methods (SAM). However, with these methods, only the frequent activities in each cluster are extracted and qualitatively described; the infrequent activities and their related travel episodes are disregarded. Thus, to quantify the occurrence probabilities of all the daily activities as well as their sequential orders, we develop a novel process to build multiple alignments of the sequences and subsequently derive profile Hidden Markov Models (pHMMs). This process consists of four major steps. First, activity sequences are clustered based on a pre-defined scheme. Second, the frequent activities along with their sequential orders are identified in each cluster. Third, these are used as a template to guide the construction of a multiple alignment of the cluster of sequences. Finally, a pHMM is employed to convert the multiple alignment into a position-specific scoring system, representing the probability of each frequent activity at each important position of the alignment as well as the probabilities of both insertion and deletion of infrequent activities. By applying the derived pHMMs to a set of activity-travel diaries collected in Belgium as well as a set of mobile phone call location data recorded in Switzerland, the potential and effectiveness of the models in capturing the sequential features of each cluster and distinguishing them from those of other clusters are demonstrated. The proposed method can also be utilized to improve activity-based transportation model validation and travel survey designs. Furthermore, it is widely applicable to characterizing any group of related sequences, particularly sequences that vary in length and include a high frequency of short sequences, as is typical of human behavior. (C) 2015 Elsevier Ltd. All rights reserved.
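The final step described in the abstract, turning a multiple alignment into a position-specific scoring system, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The activity codes (H = home, W = work, S = shopping, L = leisure), the toy alignment, the 50% occupancy threshold for choosing match positions, and the pseudocount of 1 are all assumptions made for illustration. The sketch estimates only emission probabilities and per-position deletion frequencies; a full pHMM would also estimate the transition probabilities between match, insert, and delete states.

```python
from collections import Counter

GAP = "-"  # gap symbol in the alignment (assumed convention)


def match_columns(alignment, min_occupancy=0.5):
    """Return indices of columns in which at least `min_occupancy`
    of the sequences carry an activity (i.e. are not a gap)."""
    n_rows = len(alignment)
    n_cols = len(alignment[0])
    return [
        j for j in range(n_cols)
        if sum(row[j] != GAP for row in alignment) / n_rows >= min_occupancy
    ]


def emission_probabilities(alignment, columns, alphabet, pseudocount=1.0):
    """Estimate, for each match column, the probability of every activity,
    using additive (pseudocount) smoothing so unseen activities stay > 0."""
    profiles = []
    for j in columns:
        counts = Counter(row[j] for row in alignment if row[j] != GAP)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        profiles.append(
            {a: (counts[a] + pseudocount) / total for a in alphabet}
        )
    return profiles


def deletion_rates(alignment, columns):
    """Fraction of sequences that skip (have a gap at) each match column."""
    n_rows = len(alignment)
    return [sum(row[j] == GAP for row in alignment) / n_rows for j in columns]


# Hypothetical toy alignment of four daily activity sequences.
ALPHABET = "HWSL"
alignment = ["HW-H", "HWSH", "HW-H", "H--H"]

cols = match_columns(alignment)        # column 2 is treated as an insertion
profiles = emission_probabilities(alignment, cols, ALPHABET)
deletions = deletion_rates(alignment, cols)

for j, profile, d in zip(cols, profiles, deletions):
    best = max(profile, key=profile.get)
    print(f"column {j}: most likely activity {best} "
          f"(p={profile[best]:.2f}), deletion rate {d:.2f}")
```

In this toy case the shopping activity appears in only one of four sequences, so its column falls below the occupancy threshold and is treated as an insertion of an infrequent activity rather than as a match position.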