SITSMamba for Crop Classification based on Satellite Image Time Series

Xiaolei Qin,Xin Su,Liangpei Zhang
2024-09-29
Abstract: Satellite image time series (SITS) data provides continuous observations over time, allowing for the tracking of vegetation changes and growth patterns throughout the seasons and years. Numerous deep learning (DL) approaches using SITS for crop classification have emerged recently, with the latest approaches adopting Transformer for SITS classification. However, the quadratic complexity of self-attention in Transformer poses challenges for classifying long time series. While the cutting-edge Mamba architecture has demonstrated strength in various domains, including remote sensing image interpretation, its capacity to learn temporal representations in SITS data remains unexplored. Moreover, existing SITS classification methods often depend solely on crop labels as supervision signals, which fails to fully exploit the temporal information. In this paper, we propose a Satellite Image Time Series Mamba (SITSMamba) method for crop classification based on remote sensing time series data. The proposed SITSMamba contains a spatial encoder based on Convolutional Neural Networks (CNN) and a Mamba-based temporal encoder. To exploit richer temporal information from SITS, we design two decoder branches used for different tasks. The first branch is a crop Classification Branch (CBranch), which includes a ConvBlock to decode the feature to a crop map. The second branch is a SITS Reconstruction Branch (RBranch) that uses a Linear layer to transform the encoded feature to predict the original input values. Furthermore, we design a Positional Weight (PW) applied to the RBranch to help the model learn rich latent knowledge from SITS. We also design two weighting factors to control the balance of the two branches during training. The code of SITSMamba is available at: https://github.com/XiaoleiQinn/SITSMamba.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several key issues in crop classification based on Satellite Image Time Series (SITS):

1. **Insufficient Utilization of Temporal Information**: Existing SITS classification methods typically rely only on crop labels as supervisory signals, so the temporal information in the series is underused. The resulting models extract temporal features poorly, are prone to overfitting, and generalize badly.
2. **Computational Complexity of Long Sequence Processing**: Although Transformers perform well on time series data, their self-attention mechanism has quadratic complexity in sequence length. This leads to a high computational burden on long sequences and makes it difficult to extract long-term features.
3. **Lack of a Multi-Task Learning Framework**: Existing methods lack a multi-task learning framework that can simultaneously utilize temporal and spatial information to improve the accuracy of crop classification.

To address these issues, the paper proposes SITSMamba, which combines Convolutional Neural Networks (CNN) and the Mamba architecture for crop classification based on remote sensing time series data. Its main components are:

- **Spatio-Temporal Encoder**: Uses a CNN as the spatial encoder and Mamba as the temporal encoder to efficiently handle long sequence data.
- **Multi-Task Learning Framework**: Designs two decoding branches, the Crop Classification Branch (CBranch) and the SITS Reconstruction Branch (RBranch). RBranch reconstructs the original SITS data, which strengthens the learned representation and in turn improves the performance of CBranch.
- **Positional Weight (PW)**: To better supervise RBranch, a positional weight mechanism is designed in which the weight of the reconstruction loss increases with the temporal position.

Additionally, two weighting factors balance the learning of CBranch and RBranch. With these designs, SITSMamba achieves overall accuracies (OA) of 0.7416 and 0.9104 on the PASTIS32 and MTLCC datasets, respectively, outperforming existing state-of-the-art methods.
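
The two-branch objective described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the summary only states that the reconstruction loss weight grows with temporal position and that two factors balance the branches. The linear weight schedule, the mean-squared-error reconstruction term, the normalization by the weight sum, and the names `position_weights`, `alpha`, and `beta` are all assumptions made for this example.

```python
import math


def position_weights(num_steps):
    """Hypothetical linear schedule: the weight grows with temporal position.

    Returns t / T for t = 1..T. The paper's actual schedule may differ.
    """
    return [(t + 1) / num_steps for t in range(num_steps)]


def weighted_reconstruction_loss(recon, series, weights):
    """Position-weighted MSE between a reconstructed and an original series.

    recon, series: lists of T time steps, each a list of band values.
    Later time steps contribute more because their weights are larger.
    """
    total = 0.0
    for w, r_t, x_t in zip(weights, recon, series):
        step_mse = sum((r - x) ** 2 for r, x in zip(r_t, x_t)) / len(x_t)
        total += w * step_mse
    # Assumption: normalize by the weight sum so the scale is independent of T.
    return total / sum(weights)


def cross_entropy(class_probs, label):
    """Cross-entropy for one pixel given predicted class probabilities."""
    return -math.log(class_probs[label])


def total_loss(class_probs, label, recon, series, alpha=1.0, beta=0.5):
    """Weighted sum of the classification and reconstruction terms.

    alpha and beta stand in for the two weighting factors; their names and
    default values are assumptions, not values from the paper.
    """
    weights = position_weights(len(series))
    l_cls = cross_entropy(class_probs, label)
    l_rec = weighted_reconstruction_loss(recon, series, weights)
    return alpha * l_cls + beta * l_rec


# Toy example: one pixel, two time steps, two spectral bands, three classes.
series = [[0.1, 0.2], [0.3, 0.4]]
recon = [[0.1, 0.2], [0.3, 0.6]]   # only the last time step is wrong
probs = [0.7, 0.2, 0.1]
loss = total_loss(probs, 0, recon, series)
```

In this toy case the reconstruction error sits at the final time step, where the assumed schedule assigns the largest weight, so it is penalized more heavily than the same error at the first step would be.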