Deep Learning for Spatio-Temporal Data Mining: A Survey

Senzhang Wang,Jiannong Cao,Philip S. Yu
DOI: https://doi.org/10.48550/arXiv.1906.04928
2019-06-24
Abstract:With the fast development of various positioning techniques such as Global Position System (GPS), mobile devices and remote sensing, spatio-temporal data has become increasingly available nowadays. Mining valuable knowledge from spatio-temporal data is critically important to many real world applications including human mobility understanding, smart transportation, urban planning, public safety, health care and environmental management. As the number, volume and resolution of spatio-temporal datasets increase rapidly, traditional data mining methods, especially statistics based methods for dealing with such data are becoming overwhelmed. Recently, with the advances of deep learning techniques, deep leaning models such as convolutional neural network (CNN) and recurrent neural network (RNN) have enjoyed considerable success in various machine learning tasks due to their powerful hierarchical feature learning ability in both spatial and temporal domains, and have been widely applied in various spatio-temporal data mining (STDM) tasks such as predictive learning, representation learning, anomaly detection and classification. In this paper, we provide a comprehensive survey on recent progress in applying deep learning techniques for STDM. We first categorize the types of spatio-temporal data and briefly introduce the popular deep learning models that are used in STDM. Then a framework is introduced to show a general pipeline of the utilization of deep learning models for STDM. Next we classify existing literatures based on the types of ST data, the data mining tasks, and the deep learning models, followed by the applications of deep learning for STDM in different domains including transportation, climate science, human mobility, location based social network, crime analysis, and neuroscience. Finally, we conclude the limitations of current research and point out future research directions.
Machine Learning
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is the limitations of traditional spatio - temporal data mining methods when dealing with modern large - scale, high - resolution spatio - temporal data. With the rapid development of the Global Positioning System (GPS), mobile devices and remote sensing technologies, spatio - temporal data has become increasingly rich and complex. Traditional statistical methods and other data mining techniques face challenges when dealing with such data, mainly in the following aspects: 1. **Continuity and complexity of spatio - temporal data**: Spatio - temporal data is usually embedded in continuous space, while traditional data sets (such as transaction data and graph data) are often discrete. In addition, the patterns in spatio - temporal data have both spatial and temporal characteristics, which makes it more complex and difficult for traditional methods to capture the correlations of these data. 2. **The assumption of sample independence does not hold**: Traditional statistical methods usually assume that data samples are generated independently, but when analyzing spatio - temporal data, this assumption often does not hold because spatio - temporal data tends to be highly autocorrelated. 3. **Strong dependence on feature engineering**: Traditional machine learning and data mining techniques rely heavily on manually - designed feature extraction when dealing with spatio - temporal data, which has limited processing ability for spatio - temporal data in natural form. To solve these problems, this paper explores how to use deep learning techniques to mine the value of spatio - temporal data. Deep learning models, such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), have achieved remarkable success in spatio - temporal data mining tasks due to their powerful hierarchical feature learning ability. Specifically, the advantages of deep learning include: - **Automatic feature representation learning**: Unlike traditional methods that require hand - designed features, deep learning models can directly learn hierarchical feature representations automatically from the original spatio - temporal data. - **Powerful function approximation ability**: Theoretically, deep learning can approximate any complex nonlinear function and can fit any curve through a multi - layer structure. Therefore, this paper aims to comprehensively review the research progress in applying deep learning techniques to spatio - temporal data mining in recent years, summarize the advantages and disadvantages of existing work, and point out future research directions.