An Unsupervised Approach to Inferring the Localness of People Using Incomplete Geotemporal Online Check-In Data

Chao Huang, Dong Wang, Jun Tao
DOI: https://doi.org/10.1145/3022471
IF: 5
2017-01-01
ACM Transactions on Intelligent Systems and Technology
Abstract: Inferring the localness of people means distinguishing the local residents of a city from the people who visit it by analyzing check-in points contributed by online users. This information is critical for urban planning, user profiling, and localized recommendation systems. Supervised learning approaches have been developed to infer the location of people in a city, but they assume the availability of high-quality training datasets with complete geotemporal information. In this article, we develop an unsupervised model that accurately identifies local people in a city using the incomplete online check-in data that are publicly available. In particular, we develop an incomplete geotemporal expectation maximization (IGT-EM) scheme, which incorporates a set of hidden variables to represent the localness of people and a set of estimation parameters to represent the likelihood that venues attract local and nonlocal people, respectively. Our solution can accurately distinguish local people from nonlocal ones without requiring any training data. We also implement a parallel IGT-EM algorithm that leverages the computing power of a graphics processing unit (GPU) with 2,496 cores. In the evaluation, we compare our approach with existing solutions through four real-world case studies using data from New York City, Chicago, Boston, and Washington, DC. The results show that our approach identifies local people and significantly outperforms the compared baselines in both estimation accuracy and execution time.
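To make the idea of hidden localness variables and per-venue parameters concrete, here is a minimal sketch of a generic two-class expectation maximization over a user-by-venue check-in table. It is not the authors' IGT-EM scheme: the Bernoulli per-venue model, the use of `None` for missing observations, the initialization heuristic, and all names (`em_localness`, `checkins`, `toy`) are assumptions made for illustration, and the sketch ignores temporal information and GPU parallelism entirely.

```python
import math


def em_localness(checkins, iters=50, eps=1e-6):
    """Generic two-class EM (illustrative only, not the paper's IGT-EM).

    checkins[i][j] is 1 if user i checked in at venue j, 0 if not,
    and None if that observation is missing.
    Returns (posterior, p_local, p_nonlocal).
    """
    n_users, n_venues = len(checkins), len(checkins[0])

    def clamp(p):
        # Keep probabilities away from 0 and 1 so log() stays finite.
        return min(max(p, eps), 1.0 - eps)

    # Assumed initialization: a higher observed check-in rate suggests "local".
    post = []
    for row in checkins:
        seen = [x for x in row if x is not None]
        rate = sum(seen) / len(seen) if seen else 0.5
        post.append(min(max(rate, 0.05), 0.95))

    p_local, p_nonlocal = [], []
    for _ in range(iters):
        # M-step: re-estimate the prior and per-venue check-in probabilities,
        # using only the entries that were actually observed.
        prior = clamp(sum(post) / n_users)
        p_local, p_nonlocal = [], []
        for j in range(n_venues):
            num_l = den_l = num_n = den_n = 0.0
            for i in range(n_users):
                x = checkins[i][j]
                if x is None:
                    continue
                num_l += post[i] * x
                den_l += post[i]
                num_n += (1.0 - post[i]) * x
                den_n += 1.0 - post[i]
            p_local.append(clamp(num_l / den_l if den_l > 0 else 0.5))
            p_nonlocal.append(clamp(num_n / den_n if den_n > 0 else 0.5))

        # E-step: posterior probability that each user is local.
        for i in range(n_users):
            log_l, log_n = math.log(prior), math.log(1.0 - prior)
            for j in range(n_venues):
                x = checkins[i][j]
                if x is None:
                    continue
                log_l += math.log(p_local[j] if x else 1.0 - p_local[j])
                log_n += math.log(p_nonlocal[j] if x else 1.0 - p_nonlocal[j])
            m = max(log_l, log_n)  # subtract the max for numerical stability
            w_l, w_n = math.exp(log_l - m), math.exp(log_n - m)
            post[i] = w_l / (w_l + w_n)

    return post, p_local, p_nonlocal


# Hypothetical toy data: users 0-2 favor venues 0-2, users 3-5 favor venues 3-4.
toy = [
    [1, 1, 1, 0, None],
    [1, None, 1, 0, 0],
    [1, 1, None, 0, 0],
    [0, 0, None, 1, 1],
    [None, 0, 0, 1, 1],
    [0, 0, 0, 1, None],
]
posterior, p_local, p_nonlocal = em_localness(toy)
```

Because the model is unsupervised, which cluster ends up labeled "local" depends on the initialization; the heuristic used here is only one possible way to break that symmetry.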