A dual-difference change detection network for detecting building changes on high-resolution remote sensing images

Zhongrong Xu, Chengkun Zhang, Jun Qi, Xilai Li, Bin Yao, Lu Wang
DOI: https://doi.org/10.1080/10106049.2024.2322080
IF: 3.45
2024-01-01
Geocarto International
Abstract: Existing deep learning-based change detection networks encounter challenges related to the temporal dependency inherent in dual-temporal images. In this study, a weight-shared dual-difference change detection network (DDCDNet) model is proposed based on feature extraction networks. The model employs feature discrimination modules fused with spatial and channel attention mechanisms at different hierarchical levels of the backbone network. In the encoding phase, the tiny version of the Swin Transformer is utilized as the backbone network, with a weight-sharing strategy applied to extract feature information from bi-temporal remote sensing images. The proposed model is evaluated on the LEVIR-CD+ and DSIFN datasets, achieving F1-scores of 87.71% and 85.79%, recalls of 83.87% and 81.17%, and IoU (Intersection over Union) scores of 78.11% and 75.12%, respectively. These results indicate that the proposed model significantly outperforms the comparison models, demonstrating a better capability for identifying temporal changes in buildings and excellent generalization capability.
geosciences, multidisciplinary, environmental sciences, remote sensing, imaging science & photographic technology
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper aims to address the challenges of building change detection in high-resolution remote-sensing images. Specifically, existing deep-learning-based change-detection networks have difficulties in dealing with the temporal dependence of bi-temporal images. To tackle this problem, the paper proposes a Weight-Shared Dual-Difference Change Detection Network (DDCDNet) model based on a feature extraction network. This model applies feature discrimination modules at different levels of the backbone network by fusing spatial and channel attention mechanisms to improve the ability to detect building changes.

### Main contributions

1. **Novel building change-detection algorithm**: A building change-detection algorithm for high-resolution remote-sensing images based on the Siamese network is proposed. This method effectively solves the problem of uncertain building edge changes by making full use of the rich features in remote-sensing images and combining low-level detailed information with high-level semantic information.
2. **Feature-processing module**: A feature-processing module is designed in the encoding stage to handle the differences in the number of channels and the size of feature maps between low-level and high-level features. This module enhances the network's ability to extract local-information features, improves the effect of context modeling, thereby reducing noise interference and enhancing the overall performance and efficiency of the model.
3. **Detailed experiments and visual analysis**: Detailed experiments and visual analysis are carried out on multiple datasets to verify the superior performance and robustness of the proposed method in the building change-detection task.

### Method overview

1. **Overall architecture**: The DDCDNet model adopts an encoder-decoder structure, and its core components include a weight-shared Swin-tiny backbone network, a spatial difference module, a dilated convolution pooling difference module, and a multi-level feature-fusion module.
2. **Weight-shared Swin-tiny backbone network**: The Swin Transformer tiny version is used to construct dual encoders to process bi-temporal images of the previous and subsequent time phases respectively, achieving multi-scale semantic feature extraction.
3. **Spatial difference module (difference1 module)**: It processes the feature information in the shallow stage and captures feature information at different scales through the spatial attention mechanism to reduce noise interference.
4. **Dilated convolution pooling difference module (difference2 module)**: It processes the feature information in the deep stage, uses the ASPP module to extract feature information at different scales, and enhances the model's global information-capture ability.
5. **Multi-level feature-fusion module (MLFF module)**: It effectively integrates feature information at different levels, combines low-level detailed information with high-level semantic information, and improves the comprehensive ability of feature extraction.

### Experimental results

The paper conducts experiments on two datasets, LEVIR-CD+ and DSIFN, and the evaluation metrics include F1-score, recall rate, intersection-over-union (IoU), and precision rate. The experimental results show that the proposed DDCDNet model outperforms other comparison models on these two datasets. In particular, on the DSIFN dataset, the F1-score and recall rate are increased by 12.38% and 1.36% respectively. These results indicate that the proposed model has higher accuracy and robustness in the building change-detection task.
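
### Illustrative sketch: computing the evaluation metrics

The four metrics named in the experimental results (precision, recall, F1-score, IoU) are conventionally computed from pixel-level counts of true positives, false positives, and false negatives for the "changed" class. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The names `change_metrics`, `ChangeMetrics`, and `iou_from_f1` are hypothetical, and the masks are assumed to be nested lists of 0/1 values of equal shape.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ChangeMetrics:
    precision: float
    recall: float
    f1: float
    iou: float


def change_metrics(pred: Sequence[Sequence[int]],
                   truth: Sequence[Sequence[int]]) -> ChangeMetrics:
    """Pixel-level metrics for the 'changed' class of two binary masks.

    Assumption: both masks have the same shape; 1 = changed, 0 = unchanged.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # change predicted and present
            elif p and not t:
                fp += 1          # change predicted but absent
            elif t and not p:
                fn += 1          # change present but missed

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return ChangeMetrics(precision, recall, f1, iou)


def iou_from_f1(f1: float) -> float:
    """For a single binary class, IoU = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# Toy 2x3 masks, made up purely for illustration.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
m = change_metrics(pred, truth)
print(m)
```

Because F1 and IoU are both functions of the same three counts, they are tied by IoU = F1 / (2 - F1) for a single binary class. The scores quoted in the abstract are consistent with this identity: `iou_from_f1(0.8771)` is about 0.7811 and `iou_from_f1(0.8579)` is about 0.7512, matching the reported IoU values of 78.11% and 75.12%.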