A Siamese Swin-Unet for image change detection

Yizhuo Tang, Zhengtao Cao, Ningbo Guo, Mingyong Jiang
DOI: https://doi.org/10.1038/s41598-024-54096-8
IF: 4.6
2024-02-26
Scientific Reports
Abstract: Change detection in remote sensing image processing is both difficult and important. It is used extensively in a variety of sectors, including land resource planning, monitoring and forecasting of agricultural plant health, and monitoring and assessment of natural disasters. Remote sensing images provide a large amount of long-term, full-coverage data for monitoring the Earth's environment. The rapid development of deep learning has brought considerable progress, but most deep learning-based change detection techniques currently in use rely on the convolutional neural network (CNN). Because convolution is a local operation, a CNN cannot capture the interplay between global and long-range semantic information. Some studies have employed the Vision Transformer as a backbone in the remote sensing field. Inspired by these studies, we propose Siam-Swin-Unet, a Siamese pure Transformer network with a U-shaped structure for remote sensing image change detection. Swin Transformer is a hierarchical vision transformer with shifted windows that can extract global features. To learn local and global semantic feature information, the bi-temporal images are fed into Siam-Swin-Unet, which is composed of a Swin Transformer, U-Net, a Siamese network, and two feature fusion modules. Because U-Net and Siamese networks are effective for change detection, we apply them in the model. The feature fusion module is designed to fuse bi-temporal image features, and our experiments confirm that it is efficient and has a low computational cost. Our network achieved 94.67 F1 on the CDD dataset (season varying).
multidisciplinary sciences
What problem does this paper attempt to address?
The paper addresses the challenges of remote sensing image change detection. Change detection (CD) aims to identify the areas that have changed between remote sensing images of the same location taken at two different times. These changes usually refer to changes in land cover or land use, identified through visual interpretation or related algorithms. Traditional change detection methods suffer from high complexity and low accuracy, and they require high-quality bi-temporal images. To overcome these problems, the paper proposes a new network structure, Siam-Swin-Unet, for remote sensing image change detection. The network combines the advantages of Siamese networks, Swin Transformer, and U-Net: it merges the features of the bi-temporal images while extracting both local and global semantic feature information, thereby improving change detection accuracy.
The main contributions are:
1. Combining Siamese networks with an improved Swin-Unet and applying the result to remote sensing image change detection.
2. Reducing model parameters by fusing bi-temporal image features through addition operations instead of convolution operations.
3. Upsampling the bi-temporal image features separately after the feature fusion module, to preserve the fused features.
4. Investigating the impact of different Swin Transformer window sizes on change detection performance on the CDD dataset.
Experimental results show that the model achieved an F1 score of 94.67% on the CDD dataset, outperforming the other models it was compared with and demonstrating its effectiveness in the change detection task.
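The second contribution, fusing bi-temporal features by addition rather than by convolution, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the feature maps are toy nested lists, and the convolutional alternative (concatenate the two feature maps, then convolve back to the original channel count) is an assumed baseline chosen only to show where the parameter savings would come from. The channel width of 96 is likewise an assumption, taken from the first stage of the standard Swin-T configuration.

```python
from typing import List

# Toy feature map laid out as [channels][positions]; a real model would use tensors.
FeatureMap = List[List[float]]


def fuse_by_addition(feat_t1: FeatureMap, feat_t2: FeatureMap) -> FeatureMap:
    """Element-wise sum of two same-shaped feature maps (no learnable weights)."""
    if len(feat_t1) != len(feat_t2):
        raise ValueError("feature maps must have the same number of channels")
    fused = []
    for row1, row2 in zip(feat_t1, feat_t2):
        if len(row1) != len(row2):
            raise ValueError("feature maps must have the same spatial size")
        fused.append([a + b for a, b in zip(row1, row2)])
    return fused


def addition_fusion_params(channels: int) -> int:
    """Addition introduces no parameters, whatever the channel width."""
    return 0


def conv_fusion_params(channels: int, kernel_size: int = 1) -> int:
    """Parameter count of an assumed baseline: concatenate both inputs
    (2 * channels) and convolve back to `channels` outputs, weights plus bias."""
    in_channels = 2 * channels
    weights = in_channels * channels * kernel_size * kernel_size
    bias = channels
    return weights + bias


if __name__ == "__main__":
    t1 = [[1.0, 2.0], [3.0, 4.0]]  # features from the first acquisition date
    t2 = [[0.5, 0.5], [1.0, 1.0]]  # features from the second acquisition date
    print("fused:", fuse_by_addition(t1, t2))

    channels = 96  # assumed width, see the note above
    print("addition params:", addition_fusion_params(channels))
    print("1x1 conv params:", conv_fusion_params(channels, kernel_size=1))
    print("3x3 conv params:", conv_fusion_params(channels, kernel_size=3))
```

Under these assumptions, the convolutional baseline grows quadratically with the channel width, while addition stays parameter-free at every scale of the U-shaped decoder. That is consistent with the paper's claim of a smaller model, although the actual savings depend on details of the fusion module not given here.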