A transformer-based Siamese network and an open optical dataset for semantic change detection of remote sensing images

Panli Yuan, Qingzhan Zhao, Xingbiao Zhao, Xuewen Wang, Xuefeng Long, Yuchen Zheng
DOI: https://doi.org/10.1080/17538947.2022.2111470
IF: 4.606
2022-09-14
International Journal of Digital Earth
Abstract: Recent change detection (CD) methods focus on extracting deep semantic change features. However, existing methods overlook fine-grained features and capture long-range space–time information poorly, so micro changes are missed and the edges of change types are smoothed. In this paper, a transformer-based semantic change detection (SCD) model, Pyramid-SCDFormer, is proposed, which precisely recognizes small changes and the fine edge details of changes. The SCD model selectively merges different semantic tokens in the multi-head self-attention block to obtain multiscale features, which is crucial for extracting information from remote sensing images (RSIs) that contain multiple changes at different scales. Moreover, we create a well-annotated SCD dataset, Landsat-SCD, with unprecedented time series and change types in complex scenarios. Compared with three Convolutional Neural Network-based, one attention-based, and two transformer-based networks, experimental results demonstrate that Pyramid-SCDFormer stably outperforms the existing state-of-the-art CD models and obtains improvements in MIoU/F1 of 1.11/0.76%, 0.57/0.50%, and 8.75/8.59% on the LEVIR-CD, WHU_CD, and Landsat-SCD datasets, respectively. For change classes with a proportion of less than 1%, the proposed model improves the MIoU by 7.17–19.53% on the Landsat-SCD dataset. The recognition performance for small-scale changes and the fine edges of change types is greatly improved.
geography, physical; remote sensing
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily addresses the following issues:

1. **Small-scale changes and detail edge recognition**: Existing remote sensing image change detection (CD) methods perform poorly in extracting fine-grained features and have weak capabilities in capturing long-range spatiotemporal information, leading to the loss of minor changes and blurred edges of change types. To address this, a Transformer-based semantic change detection (SCD) model, Pyramid-SCDFormer, is proposed, which can accurately identify small-scale changes and the subtle edge details of change types.
2. **Multi-scale feature extraction**: The model selectively merges different semantic tokens in its multi-head self-attention blocks to obtain multi-scale features, which is crucial for extracting information from remote sensing images (RSIs) that contain changes at different scales.
3. **Creating a large-scale SCD dataset**: To support model training and evaluation, the researchers created a new dataset named Landsat-SCD, which includes unprecedented time series and change types in complex scenarios, addressing the shortcomings of existing datasets in temporal and semantic diversity.
4. **Improved performance**: Experimental results show that Pyramid-SCDFormer consistently outperforms existing state-of-the-art CD models on the LEVIR-CD, WHU_CD, and Landsat-SCD datasets, with improvements of 1.11%/0.76%, 0.57%/0.50%, and 8.75%/8.59% in MIoU/F1 metrics, respectively. For change categories that each account for less than 1% of the data, the proposed model improves MIoU by 7.17% to 19.53% on the Landsat-SCD dataset, significantly enhancing the recognition of small-scale changes and the edges of change types.
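
The MIoU and F1 figures quoted above are commonly derived from a per-class confusion matrix. The following is a minimal sketch of that standard computation, not the authors' evaluation code: the function names `confusion_matrix` and `miou_and_f1` are hypothetical, and macro-averaging F1 over classes is an assumption, since the summary does not state how the paper averages its scores.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count label pairs; rows are true classes, columns are predictions."""
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    flat = y_true * num_classes + y_pred
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def _safe_div(num, den):
    """Element-wise division that yields NaN where the denominator is zero."""
    out = np.full(num.shape, np.nan, dtype=float)
    np.divide(num, den, out=out, where=den > 0)
    return out


def miou_and_f1(cm):
    """Return (mean IoU, macro F1), skipping classes absent from both maps."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as class c, but true label differs
    fn = cm.sum(axis=1) - tp  # true class c, but predicted as another class
    iou = _safe_div(tp, tp + fp + fn)
    f1 = _safe_div(2 * tp, 2 * tp + fp + fn)
    return float(np.nanmean(iou)), float(np.nanmean(f1))


# Toy example with three change classes (illustrative labels, not real data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou, f1 = miou_and_f1(cm)
print(f"MIoU={miou:.4f}  F1={f1:.4f}")
```

Because the mean is taken over classes rather than pixels, a rare class (for example one covering less than 1% of the pixels) weighs as much as a dominant one, which is why gains on small change classes can move MIoU noticeably.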