ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model

Hongruixuan Chen, Jian Song, Chengxi Han, Junshi Xia, Naoto Yokoya
DOI: https://doi.org/10.1109/TGRS.2024.3417253
2024-06-26
Abstract: Convolutional neural networks (CNNs) and Transformers have made impressive progress in the field of remote sensing change detection (CD). However, both architectures have inherent shortcomings: CNNs are constrained by a limited receptive field that may hinder their ability to capture broader spatial contexts, while Transformers are computationally intensive, making them costly to train and deploy on large datasets. Recently, the Mamba architecture, based on state space models, has shown remarkable performance in a series of natural language processing tasks, and it can effectively compensate for the shortcomings of the above two architectures. In this paper, we explore for the first time the potential of the Mamba architecture for remote sensing CD tasks. We tailor the corresponding frameworks, called MambaBCD, MambaSCD, and MambaBDA, for binary change detection (BCD), semantic change detection (SCD), and building damage assessment (BDA), respectively. All three frameworks adopt the cutting-edge Visual Mamba architecture as the encoder, which allows full learning of global spatial contextual information from the input images. For the change decoder, which is present in all three architectures, we propose three spatio-temporal relationship modeling mechanisms, which can be naturally combined with the Mamba architecture and fully utilize its attributes to achieve spatio-temporal interaction of multi-temporal features, thereby obtaining accurate change information. On five benchmark datasets, our proposed frameworks outperform current CNN- and Transformer-based approaches without using any complex training strategies or tricks, fully demonstrating the potential of the Mamba architecture in CD tasks. Further experiments show that our architecture is quite robust to degraded data. The source code will be available at https://github.com/ChenHongruixuan/MambaCD
Image and Video Processing, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily aims to address the technical challenges in remote sensing image change detection (CD), focusing on the limitations that Convolutional Neural Networks (CNNs) and Transformer architectures face in this task. Specifically, the paper attempts to solve the following key issues:

1. **Overcoming the limitations of CNN and Transformer**:
   - **Issues with CNN**: A CNN is limited by its finite receptive field, which may hinder its ability to capture broader spatial contextual information.
   - **Issues with Transformer**: Although the Transformer can effectively capture long-range dependencies, its computational cost is high, especially during training and deployment on large-scale datasets.
2. **Introducing the Mamba architecture**:
   - The paper explores for the first time the potential application of the state-space-model-based Mamba architecture in remote sensing change detection tasks. The Mamba architecture can effectively compensate for the shortcomings of CNN and Transformer and has already demonstrated excellent performance in a range of natural language processing tasks.
3. **Designing specific change detection frameworks**:
   - For the three change detection sub-tasks of binary change detection (BCD), semantic change detection (SCD), and building damage assessment (BDA), the paper proposes three frameworks: MambaBCD, MambaSCD, and MambaBDA, respectively.
4. **Proposing spatiotemporal relationship modeling mechanisms**:
   - In the proposed frameworks, three spatiotemporal relationship modeling mechanisms are designed, which can naturally integrate with the Mamba architecture to achieve spatiotemporal interactions among multi-temporal features, thereby obtaining accurate change information.
5. **Validating performance**:
   - On five benchmark datasets, the proposed frameworks surpass existing CNN- and Transformer-based methods without the need for complex training strategies or techniques, fully demonstrating the potential of the Mamba architecture in change detection tasks.

In summary, this paper aims to improve existing methods by introducing the Mamba architecture and its customized change detection frameworks, enhancing the accuracy, efficiency, and robustness of change detection tasks.
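
To make the efficiency argument more concrete, the following is a minimal sketch of a generic discrete linear state space recurrence, the family of models on which Mamba is based. It is not Mamba's selective scan and is not taken from the MambaCD repository; the function name `ssm_scan` and the scalar parameters `a`, `b`, and `c` are assumptions chosen purely for illustration. The sketch shows that a single pass over a sequence of length L costs O(L) steps, yet every output still depends on all earlier inputs through the hidden state.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a 1-D sequence.

    Hypothetical illustration only:
        h_t = a * h_{t-1} + b * x_t   (state update)
        y_t = c * h_t                 (readout)
    """
    h = 0.0          # hidden state, carries context from all earlier inputs
    ys = []
    for x in xs:     # one pass: cost grows linearly with sequence length
        h = a * h + b * x
        ys.append(c * h)
    return ys


# An impulse at the first position still influences the last output,
# decaying by the factor `a` at every step.
print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=0.5))  # [1.0, 0.5, 0.25, 0.125]
```

By contrast, self-attention compares every token with every other token, which is where the quadratic cost attributed to Transformers above comes from.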