Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

Hua Tang, Chong Zhang, Mingyu Jin, Qinkai Yu, Zhenting Wang, Xiaobo Jin, Yongfeng Zhang, Mengnan Du
2024-08-10
Abstract: Large language models (LLMs) have been applied in many fields and have developed rapidly in recent years. As a classic machine learning task, time series forecasting has recently been boosted by LLMs. Recent works treat large language models as *zero-shot* time series reasoners without further fine-tuning, which achieves remarkable performance. However, there are some unexplored research problems when applying LLMs for time series forecasting under the zero-shot setting. For instance, the LLMs' preferences for the input time series are less understood. In this paper, by comparing LLMs with traditional time series forecasting models, we observe many interesting properties of LLMs in the context of time series forecasting. First, our study shows that LLMs perform well in predicting time series with clear patterns and trends, but face challenges with datasets lacking periodicity. This observation can be explained by the ability of LLMs to recognize the underlying period within datasets, which is supported by our experiments. In addition, the input strategy is investigated, and it is found that incorporating external knowledge and adopting natural language paraphrases substantially improve the predictive performance of LLMs for time series. Overall, our study contributes insight into LLMs' advantages and limitations in time series forecasting under different conditions.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper explores the application of large language models (LLMs) to time series forecasting and aims to understand their capabilities and limitations in a zero-shot setting. Specifically, it focuses on the following aspects:

1. **LLMs' Preference for Input Time Series**:
   - The study finds that LLMs perform better on datasets with clear trends and seasonality but face challenges with datasets lacking periodicity.
   - Through experiments, the authors observe that LLMs can identify underlying cycles in the datasets, which explains their good performance on datasets with high trend or seasonality intensity.
2. **Impact of Input Strategies**:
   - The research shows that incorporating external knowledge into input prompts and converting numerical sequences into natural language format can significantly improve LLMs' time series forecasting performance.
   - These methods help the model capture the periodic characteristics of time series data rather than relying only on the tail of the series.
3. **Handling Multi-Period Datasets**:
   - The authors find that LLMs' performance deteriorates when a dataset contains multiple periods, possibly because the models struggle to capture the different cycles.
4. **Improving Model Performance**:
   - To further enhance model performance, the authors propose two simple methods: incorporating external human knowledge into input prompts and converting numerical sequences into natural language form. Both methods significantly improve the model's forecasting performance.

### Main Contributions

1. **Exploring LLMs' Preference for Input Sequences**:
   - The authors find through experiments that, without additional fine-tuning, LLMs perform better on time series data with high trend and seasonality intensity.
   - Across multiple runs, LLMs can effectively identify periodic patterns in the datasets, which explains why they perform well on datasets with high seasonality intensity.
2. **Proposing Methods to Improve Model Performance**:
   - The authors propose two methods to enhance model performance: incorporating external human knowledge into input prompts and converting numerical sequences into natural language form. Both methods significantly improve the model's forecasting performance.

### Experimental Methods

- **Datasets**: The authors use various real and synthetic datasets, including the Darts benchmark dataset and other commonly used datasets.
- **Evaluation Metrics**: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) are used to evaluate model performance.
- **Experimental Design**: Counterfactual examples are created by adding Gaussian noise to segments of the input, to assess the impact of different input segments on model performance.

### Key Findings

- **LLMs' Preference for High Trend and Seasonality Intensity Datasets**: LLMs perform better on time series data with high trend and seasonality intensity.
- **Challenges with Multi-Period Datasets**: As the number of periods in the dataset increases, model performance declines.
- **Impact of Input Strategies**: Adding external knowledge to input prompts and converting the numerical input to natural language can significantly improve model performance.

### Conclusion

Through systematic experiments, this paper reveals the advantages and limitations of LLMs in time series forecasting and proposes effective methods for improvement, providing a useful reference for future research.
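
### Illustrative Sketch of the Evaluation Setup

The evaluation metrics and the noise-based counterfactual design listed under Experimental Methods can be illustrated with a small, self-contained sketch. This is not the authors' code. The function names (`perturb_segment`, `seasonal_naive_forecast`), the noise level `sigma`, and the toy series are assumptions made for illustration, and a simple seasonal-naive forecaster stands in for the LLM so that the example runs without any model access. The idea is to add Gaussian noise to one segment of the input at a time and compare the forecast error for each case.

```python
import random


def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (assumes no actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def perturb_segment(series, start, end, sigma, rng):
    """Return a copy of `series` with Gaussian noise added to series[start:end]."""
    noisy = list(series)
    for i in range(start, end):
        noisy[i] += rng.gauss(0.0, sigma)
    return noisy


def seasonal_naive_forecast(history, horizon, period):
    """Hypothetical stand-in for the LLM: repeat the last full period of `history`."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    period = 4
    series = [10.0, 12.0, 14.0, 12.0] * 6  # toy periodic series, 24 points
    history, target = series[:-period], series[-period:]

    # Perturb an early segment and the tail segment, one at a time.
    segments = {
        "head": (0, period),
        "tail": (len(history) - period, len(history)),
    }
    for name, (start, end) in segments.items():
        noisy = perturb_segment(history, start, end, sigma=1.0, rng=rng)
        forecast = seasonal_naive_forecast(noisy, period, period)
        print(name, mse(target, forecast), mae(target, forecast), mape(target, forecast))
```

With this stand-in forecaster, only noise in the tail segment can change the forecast, because the forecaster reads nothing else. Swapping in a call to an actual model in place of `seasonal_naive_forecast` would show which input segments that model depends on.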