Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin
DOI: https://doi.org/10.1109/TKDE.2024.3434565
2024-07-25
Abstract: The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper targets representation learning for user check-in sequences in location-based services (LBS). It proposes the Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework to address three issues that existing methods handle poorly:

1. **Temporal Uncertainty**: Check-in times are shaped by both the user's subjective intention and objective environmental factors, so the temporal information is noisy and the user's intention is hard to pin down. For example, it may be possible to predict that a user will go for a meal next, yet the exact arrival time remains hard to determine.
2. **Spatial Diversity**: Users' activity locations are highly diverse, and locations visited in similar time periods can be completely different. For instance, activity on weekdays and on weekends usually revolves around different topics (such as work or leisure), and even on the same type of day (two weekdays or two weekends) the specific locations rarely repeat.
3. **Effective Fusion of Spatial and Temporal Information**: Spatial information is discrete and diverse, while temporal information is continuous but uncertain, which makes the two types of information difficult to fuse.

To address these challenges, the paper makes the following main contributions:

- **A novel spatial-temporal cross-view contrastive representation framework**, which applies self-supervision from the "spatial topic" and "temporal intention" views and fuses spatial and temporal information at the semantic level through a cross-view contrastive strategy.
- **Angular momentum contrast** to handle the inherent uncertainty of temporal information: a soft margin is added to the contrastive training objective to filter out temporal noise and better capture users' temporal intentions.
- **Contrastive clustering in the spatial dimension**, which identifies shared spatial topics from the high-level semantics of check-in sequences and thereby mitigates the problem of location diversity.

In the experiments, the paper evaluates STCCR on three real-world datasets and reports superior performance on three downstream tasks: next location prediction, trajectory-user linking, and temporal prediction. These results support the effectiveness and generalization ability of the proposed model.
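The soft-margin idea behind the angular momentum contrast contribution can be illustrated with a small, self-contained sketch. This is not the paper's implementation: it assumes an additive angular margin applied to the anchor-positive angle inside a standard InfoNCE loss, and it leaves out the momentum encoder entirely. The function name `angular_margin_info_nce`, the `margin` and `temperature` defaults, and the toy vectors are hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_margin_info_nce(anchor, positive, negatives,
                            margin=0.2, temperature=0.1):
    """InfoNCE loss with an additive angular margin on the positive pair.

    Assumption (not taken from the paper): the margin is added to the
    angle between anchor and positive before the cosine is taken.
    """
    # Clamp so that rounding error cannot push the value outside acos's domain.
    cos_pos = max(-1.0, min(1.0, cosine(anchor, positive)))
    theta = math.acos(cos_pos)

    # Widen the angle by the margin; cap at pi so the cosine stays monotonic.
    pos_logit = math.cos(min(math.pi, theta + margin)) / temperature
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over the positive and all negatives.
    logits = [pos_logit] + neg_logits
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))

    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - pos_logit


# Toy usage: two views of the same sequence plus two unrelated sequences.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

plain_loss = angular_margin_info_nce(anchor, positive, negatives, margin=0.0)
margin_loss = angular_margin_info_nce(anchor, positive, negatives, margin=0.3)
print(plain_loss, margin_loss)
```

With `margin=0.0` the function reduces to plain InfoNCE. A positive margin lowers the positive logit, so for the same inputs the loss is larger and training keeps pulling the positive pair together until it beats the negatives by at least that angular gap. Under the assumptions above, this is one way a margin can make the learned representation less sensitive to small timing differences between views.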