Multi-resolution Time-Series Transformer for Long-term Forecasting

Yitian Zhang, Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Mark Coates
2024-03-22
Abstract: The performance of transformers for time-series forecasting has improved significantly. Recent architectures learn complex temporal patterns by segmenting a time-series into patches and using the patches as tokens. The patch size controls the ability of transformers to learn the temporal patterns at different frequencies: shorter patches are effective for learning localized, high-frequency patterns, whereas mining long-term seasonalities and trends requires longer patches. Inspired by this observation, we propose a novel framework, Multi-resolution Time-Series Transformer (MTST), which consists of a multi-branch architecture for simultaneous modeling of diverse temporal patterns at different resolutions. In contrast to many existing time-series transformers, we employ relative positional encoding, which is better suited for extracting periodic components at different scales. Extensive experiments on several real-world datasets demonstrate the effectiveness of MTST in comparison to state-of-the-art forecasting techniques.
Machine Learning
What problem does this paper attempt to address?
This paper addresses **long-term forecasting of multivariate time series**. The authors point out two limitations of current forecasting models, especially Transformer-based ones:

1. **Insufficient sensitivity to temporal order**: Most existing time-series Transformers (TSTs) are not sensitive to ordering at the timestamp level and can even be outperformed by simple linear models (Zeng et al., 2023).
2. **Inability to effectively capture multi-scale features**: Patch-based time-series Transformers improve performance significantly, but they usually do not handle multi-scale features explicitly (Nie et al., 2023). As a result, they have difficulty capturing short-term and long-term temporal patterns at the same time.

To address these limitations, the authors propose a new framework, the **Multi-resolution Time-Series Transformer (MTST)**. It improves on existing methods in two ways:

- **Multi-branch architecture**: Each branch uses patches of a different size, so that temporal patterns at different frequencies are modeled simultaneously. Small patches capture high-frequency, local patterns, while large patches capture low-frequency, long-term trends.
- **Relative positional encoding (RPE)**: Instead of the absolute positional encoding used by many existing time-series Transformers, MTST uses relative positional encoding, which is better suited to extracting periodic components and to capturing temporal dependencies in the series.

With these changes, MTST is reported to achieve better long-term forecasting performance than existing methods on several real-world datasets.
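
The two ideas above, tokenizing one series at several patch sizes and indexing positions by relative offset, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`make_patches`, `multi_resolution_tokens`, `relative_offsets`), the patch lengths, and the strides are hypothetical choices made for illustration, and the summary does not specify the exact form of MTST's relative positional encoding. The sketch only shows how patch size changes the number and span of tokens per branch, and what the offset matrix that a relative encoding would index looks like.

```python
from typing import Dict, List, Tuple


def make_patches(series: List[float], patch_len: int, stride: int) -> List[List[float]]:
    """Split a 1-D series into patches of length patch_len, stepping by stride.

    Trailing values that do not fill a whole patch are dropped
    (an assumption of this sketch; padding is another option).
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def multi_resolution_tokens(
    series: List[float], resolutions: List[Tuple[int, int]]
) -> Dict[int, List[List[float]]]:
    """Tokenize the same series once per (patch_len, stride) pair.

    Each entry plays the role of one branch's input: short patches give
    many fine-grained tokens, long patches give few coarse tokens.
    """
    return {
        patch_len: make_patches(series, patch_len, stride)
        for patch_len, stride in resolutions
    }


def relative_offsets(num_tokens: int) -> List[List[int]]:
    """Return the matrix of offsets i - j between token positions.

    A relative positional encoding looks up a bias by this offset, so
    token pairs the same distance apart share the same positional term.
    """
    return [[i - j for j in range(num_tokens)] for i in range(num_tokens)]


# Hypothetical example: a 16-step series tokenized at three resolutions.
series = [float(t) for t in range(16)]
branches = multi_resolution_tokens(series, [(2, 2), (4, 4), (8, 8)])
token_counts = {patch_len: len(tokens) for patch_len, tokens in branches.items()}

for patch_len, count in token_counts.items():
    print(f"patch_len={patch_len}: {count} tokens")
print(relative_offsets(3))
```

In this toy setting the branch with patch length 2 sees eight short tokens, while the branch with patch length 8 sees only two tokens that each span half of the series. The offset matrix depends only on the distance between tokens, not on where they sit in the series, which is the property that makes a relative encoding a natural fit for periodic patterns.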