Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook

Ming Jin, Qingsong Wen, Yuxuan Liang, Chaoli Zhang, Siqiao Xue, Xue Wang, James Zhang, Yi Wang, Haifeng Chen, Xiaoli Li, Shirui Pan, Vincent S. Tseng, Yu Zheng, Lei Chen, Hui Xiong
2023-10-20
Abstract: Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications. They capture dynamic system measurements and are produced in vast quantities by both physical and virtual sensors. Analyzing these data types is vital to harnessing the rich information they encompass and thus benefits a wide range of downstream tasks. Recent advances in large language and other foundational models have spurred increased use of these models in time series and spatio-temporal data mining. Such methodologies not only enable enhanced pattern recognition and reasoning across diverse domains but also lay the groundwork for artificial general intelligence capable of comprehending and processing common temporal data. In this survey, we offer a comprehensive and up-to-date review of large models tailored (or adapted) for time series and spatio-temporal data, spanning four key facets: data types, model categories, model scopes, and application areas/tasks. Our objective is to equip practitioners with the knowledge to develop applications and further research in this underexplored domain. We primarily categorize the existing literature into two major clusters: large models for time series analysis (LM4TS) and spatio-temporal data mining (LM4STD). On this basis, we further classify research based on model scopes (i.e., general vs. domain-specific) and application areas/tasks. We also provide a comprehensive collection of pertinent resources, including datasets, model assets, and useful tools, categorized by mainstream applications. This survey coalesces the latest strides in large model-centric research on time series and spatio-temporal data, underscoring the solid foundations, current advances, practical applications, abundant resources, and future research opportunities.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how to effectively use large models (such as large language models and pre-trained foundation models) to analyze time-series and spatio-temporal data. Specifically, it focuses on the following aspects:

1. **Enhancing Pattern Recognition and Reasoning Abilities**: Large models can provide stronger pattern recognition and reasoning across different fields, and so process time-series and spatio-temporal data more effectively.
2. **The Foundation of Artificial General Intelligence**: Beyond understanding and processing common time-series data, these models lay the foundation for artificial general intelligence that can understand and process multiple types of data.
3. **Limitations of Existing Methods**: Traditional analysis methods rely mainly on statistical models. Although deep learning has prompted the research community to explore more powerful data-driven models, most models are still small-scale and task-specific, and cannot acquire comprehensive semantic and knowledge representations from large-scale data.
4. **Lack of Large-Scale Datasets**: Although there have been successful attempts in some fields, the lack of large-scale datasets remains a major obstacle in many cases.
5. **Potential for Cross-Domain Applications**: The paper explores the application potential of large models in multiple fields, including but not limited to earth science, transportation, energy, healthcare, environment, and finance, and presents their practical applications and future research directions in these fields.

To address the above problems, the paper provides a comprehensive review of the latest progress of large models in time-series and spatio-temporal data analysis, covering four key aspects: data types, model categories, model scopes, and application domains/tasks.
In addition, the paper provides a rich collection of resources, including datasets, model assets, and useful tools, to help researchers and practitioners further explore this field.

### Formula Examples

When discussing time-series and spatio-temporal data, the paper gives several definitions, for example:

- **Univariate Time-Series**: \[ x = \{x_1, x_2, \cdots, x_T\} \in \mathbb{R}^T \] where \(x_t \in \mathbb{R}\) represents the time-series value at time \(t\).
- **Multivariate Time-Series**: \[ X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_T\} \in \mathbb{R}^{T \times D} \] where \(\mathbf{x}_t \in \mathbb{R}^D\) represents the time-series values along \(D\) channels at time \(t\).
- **Spatio-Temporal Graph**: \[ G = \{G_1, G_2, \cdots, G_T\} \] where \(G_t = (V_t, E_t)\) represents a static graph snapshot at time \(t\), and \(V_t\) and \(E_t\) are the sets of nodes and edges, respectively.

Through these definitions and formulas, the paper describes in detail the basic structures and characteristics of time-series and spatio-temporal data, providing a theoretical basis for further research.
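
To make the three definitions concrete, here is a minimal Python sketch of how such data might be represented. It is illustrative only and does not come from the paper: the names `GraphSnapshot`, `make_snapshot`, and `shape`, and all of the toy values, are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

# Univariate series x in R^T: one real value per time step (toy values, T = 4).
univariate: List[float] = [0.5, 0.7, 0.6, 0.9]

# Multivariate series X in R^{T x D}: one D-dimensional vector per time step
# (toy values, T = 4 time steps and D = 2 channels).
multivariate: List[List[float]] = [
    [0.5, 1.0],
    [0.7, 1.1],
    [0.6, 0.9],
    [0.9, 1.3],
]


def shape(series: List[List[float]]) -> Tuple[int, int]:
    """Return (T, D) and check that every time step has D channels."""
    t = len(series)
    d = len(series[0]) if t else 0
    if any(len(row) != d for row in series):
        raise ValueError("all time steps must have the same number of channels")
    return t, d


@dataclass(frozen=True)
class GraphSnapshot:
    """Static graph G_t = (V_t, E_t) observed at a single time step."""
    nodes: FrozenSet[str]
    edges: FrozenSet[Tuple[str, str]]


def make_snapshot(nodes: Iterable[str],
                  edges: Iterable[Tuple[str, str]]) -> GraphSnapshot:
    """Build a snapshot, rejecting edges whose endpoints are not in V_t."""
    node_set = frozenset(nodes)
    edge_set = frozenset(edges)
    for u, v in edge_set:
        if u not in node_set or v not in node_set:
            raise ValueError(f"edge ({u}, {v}) references an unknown node")
    return GraphSnapshot(node_set, edge_set)


# Spatio-temporal graph G = {G_1, ..., G_T}: an ordered list of snapshots.
# Both the node set and the edge set may change between time steps.
st_graph: List[GraphSnapshot] = [
    make_snapshot(["a", "b"], [("a", "b")]),
    make_snapshot(["a", "b", "c"], [("a", "b"), ("b", "c")]),
    make_snapshot(["a", "c"], [("a", "c")]),
]
```

In practice, array libraries are the more common choice for the first two structures; plain lists are used here only to keep the sketch dependency-free.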