SERT: A Transformer Based Model for Spatio-Temporal Sensor Data with Missing Values for Environmental Monitoring

Amin Shoari Nejad, Rocío Alaiz-Rodríguez, Gerard D. McCarthy, Brian Kelleher, Anthony Grey, Andrew Parnell
2023-06-09
Abstract: Environmental monitoring is crucial to our understanding of climate change, biodiversity loss and pollution. The availability of large-scale spatio-temporal data from sources such as sensors and satellites allows us to develop sophisticated models for forecasting and understanding key drivers. However, the data collected from sensors often contain missing values due to faulty equipment or maintenance issues. The missing values rarely occur simultaneously, leading to data that are multivariate, misaligned, sparse time series. We propose two models that are capable of performing multivariate spatio-temporal forecasting while handling missing data naturally without the need for imputation. The first model is a transformer-based model, which we name SERT (Spatio-temporal Encoder Representations from Transformers). The second is a simpler model named SST-ANN (Sparse Spatio-Temporal Artificial Neural Network), which is capable of providing interpretable results. We conduct extensive experiments on two different datasets for multivariate spatio-temporal forecasting and show that our models have competitive or superior performance to those at the state-of-the-art.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses spatio-temporal forecasting for environmental monitoring when the sensor data contain missing values. It proposes two models for such data:

1. **SERT (Spatio-temporal Encoder Representations from Transformers)**: a model based on the Transformer architecture that performs multivariate spatio-temporal forecasting and handles missing data naturally, without the need for imputation.
2. **SST-ANN (Sparse Spatio-Temporal Artificial Neural Network)**: a simpler, more interpretable model that also handles missing data. Although it may be slightly less accurate than SERT, it is faster to compute and offers insight into how its predictions are derived.

Both models target the key challenge in spatio-temporal forecasting: how to handle data that are missing because of sensor failures or maintenance issues. The paper also proposes a method for handling positional information in the input data and designs a masked loss function for training the models that accounts for missing values in the output data. In experiments on both simulated and real-world datasets, the models cope with different levels of sparsity and perform well in a practical application (environmental monitoring data from Dublin Bay). Notably, SERT performs best in the experiments with 7-hour-ahead forecasts, while SST-ANN provides interpretable predictions.
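To illustrate what a masked loss for missing targets can look like, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `masked_mse`, the choice of squared error, and the convention that missing targets are stored as NaN are all assumptions made for this example. The idea is only that positions with no observed target contribute nothing to the loss.

```python
import numpy as np

def masked_mse(y_true, y_pred, mask):
    """Mean squared error computed over observed targets only.

    `mask` is 1 (or True) where a target was observed and 0 (or False)
    where it is missing. Hypothetical helper, not taken from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = np.asarray(mask, dtype=float)
    # Replace missing targets (e.g. NaN) with 0 so NaN cannot propagate.
    safe_true = np.where(mask > 0, y_true, 0.0)
    # Zero out the error wherever the target is missing.
    sq_err = mask * (y_pred - safe_true) ** 2
    n_obs = mask.sum()
    # With no observed targets, return 0.0 instead of dividing by zero.
    return float(sq_err.sum() / n_obs) if n_obs > 0 else 0.0

# Toy example: two time steps, two variables, one missing target (NaN).
y_true = np.array([[1.0, np.nan], [3.0, 4.0]])
y_pred = np.array([[1.5, 9.0], [3.0, 2.0]])
mask = ~np.isnan(y_true)
loss = masked_mse(y_true, y_pred, mask)
```

In this toy case the prediction of 9.0 at the missing position is ignored, so the loss averages only the three observed squared errors.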