A transferred spatio-temporal deep model based on multi-LSTM auto-encoder for air pollution time series missing value imputation

Xiaoxia Zhang, Pengcheng Zhou
DOI: https://doi.org/10.1016/j.future.2024.03.015
IF: 7.307
2024-03-17
Future Generation Computer Systems
Abstract: Air pollution is one of the most severe problems facing the world, and research on air quality prediction and the analysis of influencing factors continues to grow. Such research requires valid, authentic, and high-quality air pollution data to produce reasonable results. However, missing values are unavoidable in multivariate time series due to multiple causes, such as sensor and communication failures. Most previous missing-data algorithms cannot effectively capture the temporal and spatial mechanisms of air pollution, handle multiple missing patterns, or deal with sequences that have high missing rates. This paper proposes a new deep spatiotemporal imputation methodology to address this problem, namely the transferred multiple-LSTM-based deep auto-encoder (TMLSTM-AE). The idea is intuitive: train an auto-encoder to estimate the missing values. The model uses spatial and time series information to fill in single missing values, multiple missing values, block missing values, and long-interval consecutive missing values in air quality data. To verify the effectiveness and superiority of the proposed model, we conducted a case study in a city in Shaanxi, China, in which PM2.5 data with long-interval consecutive gaps and different missing patterns were imputed. The results indicate that the proposed model performs well and outperforms existing models across the different missing patterns and for long-interval consecutive missing data.
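To make the missing-pattern terminology concrete, the following is a minimal sketch of how an imputation method might be scored on artificially hidden values. It is not the TMLSTM-AE model: the function names `make_missing_mask` and `masked_mae`, the synthetic series, and the linear-interpolation baseline are all assumptions introduced here for illustration only.

```python
import numpy as np

def make_missing_mask(n_steps, pattern, rng, rate=0.1, block_len=24):
    """Return a boolean mask (True = observed) for one hypothetical missing pattern."""
    mask = np.ones(n_steps, dtype=bool)
    if pattern == "random":
        # Isolated points dropped independently at the given rate.
        mask[rng.random(n_steps) < rate] = False
    elif pattern == "block":
        # One consecutive gap of block_len steps at a random position.
        start = rng.integers(0, n_steps - block_len + 1)
        mask[start:start + block_len] = False
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return mask

def masked_mae(y_true, y_pred, mask):
    """Mean absolute error over the artificially hidden positions only."""
    hidden = ~mask
    return float(np.abs(y_true[hidden] - y_pred[hidden]).mean())

rng = np.random.default_rng(0)
# Synthetic stand-in for an hourly PM2.5 series (not real data).
pm25 = 50 + 10 * np.sin(np.linspace(0, 8 * np.pi, 240))
mask = make_missing_mask(len(pm25), "block", rng, block_len=24)

# Baseline imputation: linear interpolation across the gap (a stand-in, not TMLSTM-AE).
t = np.arange(len(pm25))
filled = pm25.copy()
filled[~mask] = np.interp(t[~mask], t[mask], pm25[mask])
error = masked_mae(pm25, filled, mask)
```

Scoring only the hidden positions keeps the error from being diluted by points the method never had to estimate; a learned model would replace the interpolation step while the masking and scoring stay the same.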
computer science, theory & methods