Multi-scale Transformer Pyramid Networks for Multivariate Time Series Forecasting

Yifan Zhang, Rui Wu, Sergiu M. Dascalu, Frederick C. Harris Jr
DOI: https://doi.org/10.1109/ACCESS.2024.3357693
2023-08-23
Abstract: Multivariate Time Series (MTS) forecasting involves modeling temporal dependencies within historical records. Transformers have demonstrated remarkable performance in MTS forecasting due to their capability to capture long-term dependencies. However, prior work has been confined to modeling temporal dependencies at either a fixed scale or multiple scales that exponentially increase (most with base 2). This limitation hinders their effectiveness in capturing diverse seasonalities, such as hourly and daily patterns. In this paper, we introduce a dimension invariant embedding technique that captures short-term temporal dependencies and projects MTS data into a higher-dimensional space, while preserving the dimensions of time steps and variables in MTS data. Furthermore, we present a novel Multi-scale Transformer Pyramid Network (MTPNet), specifically designed to effectively capture temporal dependencies at multiple unconstrained scales. The predictions are inferred from multi-scale latent representations obtained from transformers at various scales. Extensive experiments on nine benchmark datasets demonstrate that the proposed MTPNet outperforms recent state-of-the-art methods.
Machine Learning
What problem does this paper attempt to address?
The paper addresses a key issue in Multivariate Time Series (MTS) forecasting: how to capture temporal dependencies at different scales. Existing methods typically model temporal dependencies at either a fixed scale or at multiple scales that grow exponentially (mostly with base 2), which limits their ability to capture diverse seasonal patterns such as hourly and daily cycles. To overcome this limitation, the paper proposes a dimension-invariant embedding technique and a novel Multi-scale Transformer Pyramid Network (MTPNet). The embedding captures short-term temporal dependencies and projects the data into a higher-dimensional space while preserving the time-step and variable dimensions. The network captures temporal dependencies at multiple unconstrained scales, and its predictions are inferred from the multi-scale latent representations produced by transformers operating at those scales. Experiments on nine benchmark datasets show that MTPNet outperforms recent state-of-the-art methods.
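
To make the scale limitation concrete, the sketch below compares power-of-2 patch lengths with a freely chosen set of patch lengths on hourly data with a daily cycle of 24 steps. It is only an illustration of the argument, not the MTPNet architecture: the function names, the candidate scale list, and the idea of testing whether a scale tiles one period exactly are all assumptions introduced here.

```python
def power_of_two_scales(max_scale):
    """Return the scales 1, 2, 4, ... up to max_scale (the base-2 scheme)."""
    scales = []
    s = 1
    while s <= max_scale:
        scales.append(s)
        s *= 2
    return scales


def aligned_scales(scales, period):
    """Keep only the scales whose patches tile one seasonal period exactly."""
    return [s for s in scales if period % s == 0]


def split_into_patches(series, scale):
    """Split a 1-D series into non-overlapping patches of length `scale`.

    Trailing steps that do not fill a whole patch are dropped.
    """
    n_patches = len(series) // scale
    return [series[i * scale:(i + 1) * scale] for i in range(n_patches)]


DAILY_PERIOD = 24  # hourly samples per day (illustrative assumption)

base2_scales = power_of_two_scales(32)        # 1, 2, 4, 8, 16, 32
free_scales = [1, 2, 4, 6, 8, 12, 24]         # hypothetical unconstrained choice

# Base-2 scales above 8 (16 and 32) no longer line up with the daily cycle.
print(aligned_scales(base2_scales, DAILY_PERIOD))
# Every freely chosen scale here divides 24, including the full daily scale.
print(aligned_scales(free_scales, DAILY_PERIOD))

# Four days of hourly data split at the daily scale gives one patch per day.
series = list(range(96))
daily_patches = split_into_patches(series, DAILY_PERIOD)
print(len(daily_patches), len(daily_patches[0]))
```

With base-2 growth the largest scale that still aligns with a 24-step day is 8, so no single patch spans exactly one day; allowing arbitrary scales makes a patch length of 24 available.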