STC-ViT: Spatio Temporal Continuous Vision Transformer for Weather Forecasting

Hira Saleem, Flora Salim, Cormac Purcell
2024-10-31
Abstract: Operational weather forecasting systems rely on computationally expensive physics-based models. Recently, transformer-based models have shown remarkable potential in weather forecasting, achieving state-of-the-art results. However, transformers are discrete and physics-agnostic models, which limits their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with STC-ViT, a Spatio-Temporal Continuous Vision Transformer for weather forecasting. STC-ViT combines continuous-time Neural ODE layers with a multi-head attention mechanism to learn the continuous evolution of weather over time. The attention mechanism is encoded as a differentiable function in the transformer architecture to model the complex weather dynamics. Further, we define a customised physics-informed loss for STC-ViT which penalizes the model's predictions for deviating from physical laws. We evaluate STC-ViT against an operational Numerical Weather Prediction (NWP) model and several deep-learning-based weather forecasting models. STC-ViT, trained on 1.5-degree 6-hourly data, demonstrates computational efficiency and competitive performance compared to state-of-the-art data-driven models trained on higher-resolution data for global forecasting.
Machine Learning
### What problems does this paper attempt to solve?

This paper aims to solve the trade-off problem between computational efficiency and accuracy in traditional physics-based numerical weather prediction (NWP) models. Specifically, the authors propose a new method named **STC-ViT** (Spatio-Temporal Continuous Vision Transformer) to improve the continuous spatio-temporal modeling ability of the weather forecasting system.

#### Main problems:

1. **Computational efficiency and accuracy**: Traditional physics-based numerical weather forecasting systems are accurate but extremely computationally costly and have cumulative errors, requiring a large amount of computational resources (Palmer et al., 2005; Andersson, 2022). Therefore, there is an urgent need for a method that can ensure accuracy while improving computational efficiency.
2. **Limitations of discrete models**: Existing deep-learning models based on Transformers perform well in weather forecasting, but they are essentially discrete and ignore the basic physical laws of the atmosphere, which limits their ability to learn continuous spatio-temporal features (Fonseca et al., 2023).
3. **Modeling of continuous spatio-temporal dynamics**: Weather data has significant spatio-temporal continuity and dynamic evolution characteristics, which pose challenges for generating accurate forecasts. Existing discrete models have difficulty capturing these complex spatio-temporal changes.

#### Solutions:

To solve the above problems, the authors propose **STC-ViT**, and its main innovations include:

- **Continuous spatio-temporal attention mechanism**: By introducing the continuous-time neural ordinary differential equation (Neural ODE) layer and the multi-head attention mechanism, STC-ViT can learn the continuous evolution process of the weather system, thereby better capturing spatio-temporal continuity.
- **Physics-constrained loss function**: To keep the model predictions consistent with atmospheric physical laws, the authors design a customized physics-informed loss function. It adds soft penalty terms that penalize predictions for deviating from those laws, pushing them closer to real physical behavior.
- **Pre-processing step**: The time derivatives of the weather variables are computed as a pre-processing step, which further improves feature extraction.

Through these improvements, STC-ViT is computationally efficient and shows performance competitive with existing state-of-the-art data-driven models in global weather forecasting, even though it is trained on lower-resolution (1.5-degree, 6-hourly) data than those models.

### Summary

The main goal of this paper is to develop a weather forecasting system that can operate efficiently and predict accurately. By combining continuous spatio-temporal modeling with physical constraints, STC-ViT improves computational efficiency while encouraging physically consistent predictions.
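
The continuous attention mechanism listed under Solutions can be illustrated with a toy example. The sketch below treats self-attention as the right-hand side of an ODE, dz/dt = Attention(z), and integrates it with fixed-step Euler. Everything in it is an assumption made for illustration: the single head, the identity query/key/value projections, the Euler solver, and the function names `self_attention`, `attention_ode_rhs` and `integrate_euler` are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), so there are no learned weights here.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return attended


def attention_ode_rhs(tokens):
    """Hypothetical dynamics: dz/dt = Attention(z)."""
    return self_attention(tokens)


def integrate_euler(tokens, t_end=1.0, steps=10):
    """Integrate the token state from t=0 to t=t_end with fixed-step Euler."""
    dt = t_end / steps
    state = [row[:] for row in tokens]
    for _ in range(steps):
        deriv = attention_ode_rhs(state)
        state = [
            [z + dt * dz for z, dz in zip(z_row, dz_row)]
            for z_row, dz_row in zip(state, deriv)
        ]
    return state


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy patch embeddings
    print(integrate_euler(patches, t_end=1.0, steps=20))
```

With `steps=1` and `t_end=1.0` the update reduces to `z + Attention(z)`, which is the familiar residual attention block. Seen this way, a discrete transformer layer is a single coarse Euler step of the continuous model, and a finer step size or an adaptive solver evaluates the same dynamics at intermediate times.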
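
The soft-penalty idea behind the physics-constrained loss can be sketched as a data term plus a weighted physics residual. The summary does not say which physical laws the paper penalizes, so the constraint below (the total of a field stays constant between two time steps) is a made-up stand-in. The names `conservation_residual` and `physics_informed_loss` and the weight value are likewise hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally long lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def conservation_residual(pred, previous):
    """Hypothetical constraint: the field total should not change between steps.

    Returns 0.0 when the predicted field conserves the total of the
    previous field; any other value measures the violation.
    """
    return sum(pred) - sum(previous)


def physics_informed_loss(pred, target, previous, weight=0.1):
    """Data loss plus a soft penalty on the (assumed) physics residual."""
    data_term = mse(pred, target)
    physics_term = conservation_residual(pred, previous) ** 2
    return data_term + weight * physics_term


if __name__ == "__main__":
    previous = [1.0, 2.0, 3.0]  # field at time t
    target = [2.0, 2.0, 2.0]    # observed field at time t + 1 (same total)
    print(physics_informed_loss([2.0, 2.0, 3.0], target, previous))
```

Because the penalty is soft, a prediction that violates the constraint is discouraged rather than forbidden. The `weight` parameter sets how strongly the physics term competes with the data term.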
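
The pre-processing step can be approximated with finite differences between consecutive snapshots. The sketch below assumes forward differences over the 6-hour spacing mentioned in the abstract. The paper's actual differencing scheme is not given in this summary, and the function names `time_derivatives` and `augment_with_derivatives` are hypothetical.

```python
def time_derivatives(snapshots, dt_hours=6.0):
    """Forward-difference rate of change between consecutive snapshots.

    `snapshots` is a list of equally long lists, one per time step.
    Returns one derivative list per pair of neighbouring snapshots.
    """
    rates = []
    for earlier, later in zip(snapshots, snapshots[1:]):
        rates.append([(b - a) / dt_hours for a, b in zip(earlier, later)])
    return rates


def augment_with_derivatives(snapshots, dt_hours=6.0):
    """Pair each snapshot (from the second onward) with the rate leading to it."""
    rates = time_derivatives(snapshots, dt_hours)
    return list(zip(snapshots[1:], rates))


if __name__ == "__main__":
    # Toy temperature values (kelvin) at two grid points over three time steps.
    temps = [[280.0, 281.0], [283.0, 281.0], [283.0, 275.0]]
    for state, rate in augment_with_derivatives(temps):
        print(state, rate)
```

Each model input then carries both the current state and an estimate of how fast it is changing, which is one plausible reading of how the derivatives help feature extraction.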