Self-Supervised Spatiotemporal Imputation Model for Highly Sparse Chl-a Data via Fusing Multisource Satellite Data
Shuyu Wang, Wengen Li, Jihong Guan, Xiwei Liu, Yichao Zhang, Shuigeng Zhou
DOI: https://doi.org/10.1109/tgrs.2024.3440912
IF: 8.2
2024-08-25
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Monitoring chlorophyll-a (Chl-a) concentration in the ocean is important for early warning of algal disasters and for protecting the marine ecological environment. However, owing to uncontrollable factors such as cloud cover and thick aerosols, the missing rate of observed Chl-a data is very high, which seriously hinders its applications. Existing data imputation methods mostly target Chl-a data with a low missing rate and perform much worse when the missing rate reaches 0.7 or above. To address this issue, we propose a self-supervised spatiotemporal imputation (S2-STI) model that imputes highly sparse Chl-a data by fusing multisource satellite data. First, to obtain comprehensive information about the missing Chl-a data, we design a multisource sparse data fusion (MSDF) module that fuses Chl-a data, sea surface temperature (SST) data, and photosynthetically available radiation (PAR) data. MSDF constructs a spatiotemporal graph network to learn high-quality spatiotemporal representations of the SST and PAR data, and fuses the learned representations with the Chl-a data to enrich the information available for imputation. Then, we develop a generation-based data imputation (GDI) module to model the distribution of Chl-a data based on the outputs of MSDF. Considering the high data sparsity, we design a self-supervised training strategy to train S2-STI in the absence of ground truth. Finally, we use the data generated by the GDI module of the trained S2-STI model to fill in the missing values in the sparse Chl-a data. Experiments on real datasets show that S2-STI performs much better than existing data imputation methods. Specifically, for Chl-a data with a missing rate of 0.9, S2-STI achieves an improvement of at least 12% in mean absolute error (MAE) compared with strong baseline methods.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
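
The abstract reports results in terms of missing rate and MAE and mentions training without ground truth, but it does not spell out the procedure. The sketch below shows one common way such a setup can be arranged, under stated assumptions: some observed cells are hidden on purpose, an imputer fills the grid, and the error is measured only on the hidden cells. All function names here (`missing_rate`, `hold_out_observed`, `mean_fill`, `mean_absolute_error`) are hypothetical, the grid values are made up for illustration, and `mean_fill` is a trivial placeholder rather than the S2-STI model.

```python
import math
import random


def missing_rate(grid):
    """Fraction of cells in a 2D grid that are missing (NaN)."""
    cells = [v for row in grid for v in row]
    return sum(math.isnan(v) for v in cells) / len(cells)


def hold_out_observed(grid, hold_out_fraction, seed=0):
    """Hide a random subset of the observed cells.

    Returns the further-masked grid and a dict mapping each hidden
    (row, col) position to its original value, which then serves as a
    pseudo ground truth. Assumption: this hold-out scheme is a generic
    self-supervised setup, not necessarily the one used in the paper.
    """
    observed = [(i, j) for i, row in enumerate(grid)
                for j, v in enumerate(row) if not math.isnan(v)]
    if len(observed) < 2:
        raise ValueError("need at least two observed cells")
    # Always hide at least one cell and keep at least one visible.
    k = min(len(observed) - 1,
            max(1, round(len(observed) * hold_out_fraction)))
    hidden = random.Random(seed).sample(observed, k)
    masked = [list(row) for row in grid]
    targets = {}
    for i, j in hidden:
        targets[(i, j)] = grid[i][j]
        masked[i][j] = math.nan
    return masked, targets


def mean_fill(grid):
    """Placeholder imputer: replace NaN with the mean of observed cells."""
    observed = [v for row in grid for v in row if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    return [[mean if math.isnan(v) else v for v in row] for row in grid]


def mean_absolute_error(imputed, targets):
    """MAE computed only at the held-out positions."""
    total = sum(abs(imputed[i][j] - t) for (i, j), t in targets.items())
    return total / len(targets)


if __name__ == "__main__":
    nan = math.nan
    # Illustrative Chl-a-like grid; the values are invented.
    chl = [[0.2, nan, nan, 0.4, nan],
           [nan, 0.6, nan, nan, 0.8]]
    masked, targets = hold_out_observed(chl, hold_out_fraction=0.5)
    filled = mean_fill(masked)
    print("missing rate before:", missing_rate(chl))
    print("missing rate after hold-out:", missing_rate(masked))
    print("MAE on held-out cells:", mean_absolute_error(filled, targets))
```

Swapping `mean_fill` for any other imputer leaves the scoring unchanged, which is the point of measuring error on held-out observed cells when the truly missing values are unknown.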