From Raw Footprints to Personal Interests: Bridging the Semantic Gap Via Trip Intention Aggregation

Long Guo,Dongxiang Zhang,Huayu Wu,Bin Cui,Kian-Lee Tan
DOI: https://doi.org/10.1109/icde.2017.55
2017-01-01
Abstract:User-generated trajectories (UGT), such as GPS footprints from wearable devices or travel records from bus companies, capture rich information of human mobility and urban dynamics in the offline world. In this paper, our objective is to enrich these raw footprints and discover the users' personal interests by utilizing the semantic information contained in the spatial-and temporal-aware user-generated contents (STUGC) published in the online world.We design a novel probabilistic framework named CO2 to connect the offline world with the online world in order to discover the users' interests directly from their raw footprints in UGT. In particular, we first propose a latent probabilistic generative model named STLDA to infer the intention attached with each trip, and then aggregate the extracted trip intentions to discover the users' personal interests.To tackle the inherent sparsity and noisiness problems of the tags in STUGC, STLDA considers the inner correlation between tags (i.e., semantic, spatial and temporal correlation) on the topic-level. To evaluate the effectiveness of CO2, we utilize a dataset containing three months of data with 5.3 billion bus records and a Twitter dataset with 1.5 million tweets published in 6 months in Singapore as a case study. Experimental results on these two real-world datasets show that CO2 is effective in discovering user interests and improves the precision of the state-of-the-art method by 280%. In addition, we also conduct a questionnaire survey in Singapore to evaluate the effectiveness of CO2. The results further validate the superiority of CO2.
What problem does this paper attempt to address?