A Multi-scale Spatiotemporal Attention Network for Ground-Based Remote Sensing Cloud Image Sequence Prediction

Feng Zhang, Yingying Cheng, Qiang Hua, Chunru Dong, Yong Zhang, Tingdong Wu
DOI: https://doi.org/10.1109/tgrs.2024.3485581
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Ground-based cloud image sequence prediction provides valuable insights into cloud motion and meteorological conditions, which are essential for photovoltaic power generation systems. However, most existing models are recurrent-based. Because such models do not support parallel inference, they are slow at inference time and struggle to deliver satisfactory forecasts quickly. To address these issues, a novel recurrent-free deep learning framework, called Multi-scale Spatiotemporal Attention Network (MSTANet), is proposed in this study. MSTANet leverages a multi-scale spatiotemporal attention module to extract multi-scale, nonlinear spatiotemporal dependencies from cloud image sequences, and uses a multi-scale time attention module to reinforce the temporal dependencies by capturing the high- and low-frequency spatiotemporal fluctuations of clouds. To mitigate the ghosting effects that are prevalent in spatiotemporal prediction tasks, a gated aggregation unit is introduced to filter useful context information by integrating the historical information with the updated predictions. Additionally, a multi-order differential divergence regularization term is introduced into the loss function to improve the model’s performance by encouraging MSTANet to focus on the evolving trends in the neighborhoods of clouds. Experimental results show that the proposed MSTANet outperforms state-of-the-art (SOTA) prediction methods. It reduces the parameter count by 46% and MSE by 4.31% on the Moving MNIST dataset, and reduces the parameter count by 22% with a 1.82% performance improvement on the Folsom dataset, compared to the baseline TAU. The code is available at https://github.com/Csorasky/MSTANet.
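The abstract names a multi-order differential divergence regularization term but does not give its formula. The sketch below shows one plausible reading, not the authors' implementation: it assumes the term is a KL divergence between softmax-normalized temporal differences of the predicted and ground-truth frames, averaged over difference orders 1 to `max_order`. The function names, the temperature `tau`, and the averaging scheme are all assumptions made for illustration; frames are represented as flat lists of pixel values to keep the example dependency-free.

```python
import math

def temporal_diff(frames, order):
    """Apply the forward temporal difference `order` times to a sequence of flat frames."""
    for _ in range(order):
        frames = [[b - a for a, b in zip(f0, f1)]
                  for f0, f1 in zip(frames, frames[1:])]
    return frames

def softmax(values, tau=1.0):
    """Numerically stable softmax over one flattened frame (tau is an assumed temperature)."""
    m = max(values)
    exps = [math.exp((v - m) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def multi_order_ddr(pred, target, max_order=2, tau=1.0):
    """Hypothetical multi-order regularizer: mean KL divergence between the
    softmax-normalized temporal differences of orders 1..max_order."""
    total, count = 0.0, 0
    for order in range(1, max_order + 1):
        diffs_pred = temporal_diff(pred, order)
        diffs_true = temporal_diff(target, order)
        for dp, dt in zip(diffs_pred, diffs_true):
            total += kl_divergence(softmax(dp, tau), softmax(dt, tau))
            count += 1
    return total / count if count else 0.0

# Toy sequence: a single bright pixel moving across a 3-pixel frame.
target = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
static_pred = [[1.0, 0.0, 0.0]] * 4  # a prediction that ignores the motion

print(multi_order_ddr(target, target))       # identical sequences give zero penalty
print(multi_order_ddr(static_pred, target))  # a motionless prediction is penalized
```

Under these assumptions the regularizer would be added to the pixel-wise loss with a weighting coefficient, so that a prediction matching the average appearance but missing the frame-to-frame change still incurs a cost.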