SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation [Software and Data Sets]
Yi Wang, Nassim Ait Ali Braham, Zhitong Xiong, Chenying Liu, Conrad M. Albrecht, Xiao Xiang Zhu
DOI: https://doi.org/10.1109/mgrs.2023.3281651
IF: 14.6
2023-09-01
IEEE Geoscience and Remote Sensing Magazine
Abstract: Self-supervised pretraining has the potential to generate expressive representations from large-scale Earth observation (EO) data without human annotation. However, most existing pretraining in the field is based on ImageNet or medium-sized, labeled remote sensing (RS) datasets. In this article, we share an unlabeled dataset, Self-Supervised Learning for Earth Observation-Sentinel-1/2 (SSL4EO-S12), which assembles a large-scale, global, multimodal, and multiseasonal corpus of satellite imagery. We demonstrate that SSL4EO-S12 succeeds in self-supervised pretraining for a set of representative methods, namely momentum contrast (MoCo), self-distillation with no labels (DINO), masked autoencoders (MAE), and data2vec, and for multiple downstream applications, including scene classification, semantic segmentation, and change detection. Our benchmark results show the effectiveness of SSL4EO-S12 compared to existing datasets. The dataset, related source code, and pretrained models are available at https://github.com/zhu-xlab/SSL4EO-S12.
geochemistry & geophysics, remote sensing, imaging science & photographic technology