Pre-trained Transformer Uncovers Meaningful Patterns in Human Mobility Data

Alameen Najjar
2024-06-06
Abstract: We empirically demonstrate that a transformer pre-trained on country-scale unlabeled human mobility data learns embeddings capable, through fine-tuning, of developing a deep understanding of the target geography and its corresponding mobility patterns. Utilizing an adaptation framework, we evaluate the performance of our pre-trained embeddings in encapsulating a broad spectrum of concepts directly and indirectly related to human mobility. This includes basic notions, such as geographic location and distance, and extends to more complex constructs, such as administrative divisions and land cover. Our extensive empirical analysis reveals a substantial performance boost gained from pre-training, reaching up to 38% in tasks such as tree-cover regression. We attribute this result to the ability of the pre-training to uncover meaningful patterns hidden in the raw data, beneficial for modeling relevant high-level concepts. The pre-trained embeddings emerge as robust representations of regions and trajectories, potentially valuable for a wide range of downstream applications.
Computers and Society, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper asks whether a Transformer pre-trained on large-scale unlabeled human mobility data learns meaningful mobility patterns, and whether fine-tuning its pre-trained embeddings yields a deep understanding of the target geography and its corresponding mobility patterns.

### Specific Problem Description

1. **Understanding of Human Mobility Data**
   - The paper explores whether a Transformer pre-trained on large-scale unlabeled human mobility data can capture meaningful mobility patterns.
   - The analogy is to word embeddings in natural language processing (NLP): the paper tests whether pre-trained models capture patterns in mobility data the way word embeddings capture linguistic structure and semantics.
2. **Modeling of High-Level Geospatial Concepts**
   - The paper evaluates how well the pre-trained embeddings model a range of concepts directly or indirectly related to human mobility, from basic notions such as geographic location and distance to more complex constructs such as administrative divisions and land cover.
   - Extensive empirical analysis shows a substantial performance gain from pre-training, reaching up to 38% on tasks such as tree-cover regression.
3. **Potential for Downstream Applications**
   - As robust representations of regions and trajectories, the pre-trained embeddings may be valuable for a wide range of applications, such as demography, land use, and transportation planning.

### Main Contributions

- The paper provides empirical evidence that a self-supervised pre-trained Transformer, trained on national-scale human mobility data, uncovers patterns useful for modeling various geospatial concepts (such as population size and travel behavior).
- This capability had not been reported in previous studies, so the work fills a gap in the field.

### Method Overview

1. **Data**
   - More than 53 billion GPS data points collected by Rakuten were used to generate approximately 17 million trajectories.
2. **Model Architecture**
   - BERT was adopted as the base architecture and adapted to trajectory data by introducing a spatial tokenization step.
3. **Pre-training Task**
   - Masked Trajectory Modeling (MTM) was used as the self-supervised pre-training task, analogous to Masked Language Modeling (MLM) in NLP.
4. **Adaptation Methods**
   - The pre-trained embeddings were evaluated through three adaptation methods: fine-tuning, few-shot learning, and zero-shot learning.

### Experimental Results

- Results across multiple tasks show that pre-training significantly improves model performance, especially on prediction tasks related to geographic location, demography, administrative divisions, and land cover.

### Conclusion

The paper empirically demonstrates that pre-trained Transformers can capture meaningful patterns in human mobility data, and that fine-tuning lets these patterns be applied to modeling multiple geospatial attributes, suggesting new directions for future GeoAI research.
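
To make the Method Overview more concrete, the spatial tokenization and Masked Trajectory Modeling steps might look roughly like the sketch below. It rests on assumptions the summary does not state: a uniform latitude/longitude grid stands in for the paper's tokenizer, consecutive duplicate cells are collapsed, the 15% masking rate is borrowed from BERT's MLM setup, and every selected token is replaced by the mask token (BERT itself also keeps or randomizes some of them). The names `tokenize_point`, `tokenize_trajectory`, `mask_trajectory`, and `MASK_TOKEN` are hypothetical.

```python
import math
import random

# Hypothetical mask token; the paper's actual vocabulary is not given in the summary.
MASK_TOKEN = "[MASK]"


def tokenize_point(lat, lon, cell_deg=0.01):
    """Map a GPS point to a discrete grid-cell token.

    Assumption: a uniform lat/lon grid with cells of `cell_deg` degrees.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return f"cell_{row}_{col}"


def tokenize_trajectory(points, cell_deg=0.01):
    """Convert a sequence of (lat, lon) points into a sequence of cell tokens.

    Assumption: consecutive points in the same cell collapse into one token.
    """
    tokens = []
    for lat, lon in points:
        tok = tokenize_point(lat, lon, cell_deg)
        if not tokens or tokens[-1] != tok:
            tokens.append(tok)
    return tokens


def mask_trajectory(tokens, mask_prob=0.15, rng=None):
    """Randomly hide tokens, in the style of masked language modeling.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere, so a model would be trained to
    predict only the hidden cells.
    """
    if rng is None:
        rng = random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK_TOKEN
            labels[i] = tok
    return masked, labels


if __name__ == "__main__":
    # Three illustrative points; the first two fall in the same 1-degree cell.
    trajectory = [(35.2, 139.3), (35.4, 139.6), (36.1, 139.3)]
    tokens = tokenize_trajectory(trajectory, cell_deg=1.0)
    masked, labels = mask_trajectory(tokens, mask_prob=0.5, rng=random.Random(0))
    print(tokens)
    print(masked)
    print(labels)
```

In a full pipeline the masked token sequences would be mapped to integer ids and fed to a BERT-style encoder trained to recover the hidden cells; that part is omitted here because the summary gives no details about it.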