Inter-Series Transformer: Attending to Products in Time Series Forecasting

Rares Cristian, Pavithra Harsha, Clemente Ocejo, Georgia Perakis, Brian Quanz, Ioannis Spantidakis, Hamza Zerhouni
2024-08-08
Abstract: Time series forecasting is an important task in many fields ranging from supply chain management to weather forecasting. Recently, Transformer neural network architectures have shown promising results in forecasting on common time series benchmark datasets. However, application to supply chain demand forecasting, which can have challenging characteristics such as sparsity and cross-series effects, has been limited. In this work, we explore the application of Transformer-based models to supply chain demand forecasting. In particular, we develop a new Transformer-based forecasting approach using a shared, multi-task per-time series network with an initial component applying attention across time series, to capture interactions and help address sparsity. We provide a case study applying our approach to successfully improve demand prediction for a medical device manufacturing company. To further validate our approach, we also apply it to public demand forecasting datasets and demonstrate competitive to superior performance compared to a variety of baseline and state-of-the-art forecast methods across the private and public datasets.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses challenges in supply chain demand forecasting, with a case study on demand forecasting for a medical device manufacturing company. Specifically, it examines how Transformer-based models can be adapted to handle the following challenges:

1. **Sparsity**: At the fine-grained product-location level, sales observations can be very sparse.
2. **Cross-product effects**: A change in demand for one product may affect demand for other products.
3. **Changing sets of time series**: New products are continually added and old products are phased out, so the set of time series changes over time.

To address these challenges, the authors propose a new Transformer variant called the Inter-Series Transformer. It has two main components:

- **Self-attention across time series**: An initial stage of the model applies self-attention across the different time series to capture interactions between them, which helps address sparsity.
- **Shared multi-task per-series network**: A shared Transformer network is applied to each time series independently and trained in a multi-task setting. This helps avoid overfitting and lets the model be applied when the set of time series changes over time.

With this design, the authors aim to combine the strengths of multivariate modeling and per-series multi-task modeling, thereby improving forecast accuracy. The paper validates the approach through a real-world case study and experiments on public datasets.
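The two components described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, tensor layout, residual connection, and random weights are all assumptions made for illustration, and a single linear head stands in for the shared per-series Transformer network. The sketch only shows the general idea of attending across series at each time step and then applying the same weights to every series independently.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(d_model, d_key, seed=0):
    # Hypothetical parameter set; random weights stand in for trained ones.
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d_model)
    return {
        "w_q": rng.normal(size=(d_model, d_key)) * scale,
        "w_k": rng.normal(size=(d_model, d_key)) * scale,
        "w_v": rng.normal(size=(d_model, d_model)) * scale,
        "w_out": rng.normal(size=(d_model,)) * scale,
    }


def inter_series_attention(x, params):
    """Attention across series, computed separately at each time step.

    x has shape (num_series, num_steps, d_model) (assumed layout).
    Returns the mixed features, same shape as x, and the attention
    weights with shape (num_steps, num_series, num_series).
    """
    xt = np.swapaxes(x, 0, 1)                      # (T, N, d_model)
    q = xt @ params["w_q"]                         # (T, N, d_key)
    k = xt @ params["w_k"]                         # (T, N, d_key)
    v = xt @ params["w_v"]                         # (T, N, d_model)
    scores = q @ np.swapaxes(k, 1, 2) / np.sqrt(q.shape[-1])  # (T, N, N)
    weights = softmax(scores, axis=-1)             # each row sums to 1
    mixed = weights @ v                            # (T, N, d_model)
    return np.swapaxes(mixed, 0, 1), weights


def forecast(x, params):
    """One-step forecast per series from the last time step.

    The residual connection and the linear head are assumptions; the
    head is a stand-in for the shared per-series network, and the same
    weights are applied to every series.
    """
    mixed, _ = inter_series_attention(x, params)
    h = x + mixed                                  # residual (assumption)
    return h[:, -1, :] @ params["w_out"]           # (N,)


# Usage: the same parameters handle different numbers of series.
params = init_params(d_model=4, d_key=8)
x = np.random.default_rng(1).normal(size=(5, 12, 4))  # 5 series, 12 steps
print(forecast(x, params).shape)      # (5,)
print(forecast(x[:3], params).shape)  # (3,)
```

Because no parameter in this sketch depends on the number of series, the same weights can be reused when series are added or removed, which mirrors the motivation given for the shared per-series design.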