Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection

Zijian Zhu, Ali Zia, Xuesong Li, Bingbing Dan, Yuebo Ma, Hongfeng Long, Kaili Lu, Enhai Liu, Rujin Zhao
2024-08-09
Abstract: Stripe-like space target detection (SSTD) is crucial for space situational awareness. Traditional unsupervised methods often fail in low signal-to-noise ratio and variable stripe-like space targets scenarios, leading to weak generalization. Although fully supervised learning methods improve model generalization, they require extensive pixel-level labels for training. In the SSTD task, manually creating these labels is often inaccurate and labor-intensive. Semi-supervised learning (SSL) methods reduce the need for these labels and enhance model generalizability, but their performance is limited by pseudo-label quality. To address this, we introduce an innovative Collaborative Static-Dynamic Teacher (CSDT) SSL framework, which includes static and dynamic teacher models as well as a student model. This framework employs a customized adaptive pseudo-labeling (APL) strategy, transitioning from initial static teaching to adaptive collaborative teaching, guiding the student model's training. The exponential moving average (EMA) mechanism further enhances this process by feeding new stripe-like knowledge back to the dynamic teacher model through the student model, creating a positive feedback loop that continuously enhances the quality of pseudo-labels. Moreover, we present MSSA-Net, a novel SSTD network featuring a multi-scale dual-path convolution (MDPC) block and a feature map weighted attention (FMWA) block, designed to extract diverse stripe-like features within the CSDT SSL training framework. Extensive experiments verify the state-of-the-art performance of our framework on the AstroStripeSet and various ground-based and space-based real-world datasets.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the shortcomings of existing approaches to stripe-like space target detection (SSTD) when the signal-to-noise ratio (SNR) is low and the stripe-like targets vary widely: traditional unsupervised methods generalize poorly, fully supervised methods depend on labels that are hard to obtain, and semi-supervised methods are limited by pseudo-label quality. Specifically:

1. **Limitations of traditional unsupervised methods**:
   - Traditional unsupervised methods rely on hand-crafted filters or morphological operations. At low SNR and with highly variable stripe-like targets, these methods generalize poorly.
   - These methods are very sensitive to noise, perform poorly in low-SNR scenarios, and have high computational complexity.
2. **Limitations of fully supervised learning methods**:
   - Although fully supervised learning methods improve the generalization ability of the model, they require a large number of pixel-level labels for training.
   - In the SSTD task, manually creating these labels is both time-consuming and error-prone, so accurately labeled data is difficult to obtain.
3. **Challenges of semi-supervised learning methods**:
   - Semi-supervised learning (SSL) methods reduce the dependence on labels and enhance the generalization ability of the model, but their performance is limited by the quality of pseudo-labels.
   - The existing single teacher-student framework is prone to overfitting, especially under varying stray light conditions, which hurts the generalization ability of the model.

To solve the above problems, the authors propose a Collaborative Static-Dynamic Teacher (CSDT) semi-supervised learning framework and a new SSTD network, MSSA-Net.
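The training objective and teacher update listed in the formula summary on this page can be illustrated with a small sketch. It is not the authors' implementation: the function names (`dice_loss`, `mse_loss`, `total_loss`, `ema_update`) are hypothetical, flat Python lists stand in for prediction maps and model weights, and the soft Dice form with a smoothing term `eps` is an assumption, since the page does not give the exact Dice definition.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two flat sequences of values in [0, 1].

    Assumed form: 1 - (2 * overlap + eps) / (sum(pred) + sum(target) + eps).
    """
    overlap = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * overlap + eps) / (denom + eps)


def mse_loss(pred, target):
    """Mean squared error between two flat sequences of equal length."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(l_s, l_c, l_u, lambda_c, lambda_u):
    """L_t = L_s + lambda_c * L_c + lambda_u * L_u."""
    return l_s + lambda_c * l_c + lambda_u * l_u


def ema_update(teacher, student, alpha=0.99):
    """Return alpha * teacher + (1 - alpha) * student, element by element."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


# Toy example: one labeled and one unlabeled "image" of three pixels each.
student_on_labeled = [0.9, 0.1, 0.8]
label = [1.0, 0.0, 1.0]
student_on_unlabeled = [0.7, 0.2, 0.6]
teacher_on_unlabeled = [0.8, 0.1, 0.7]
pseudo_label = [1.0, 0.0, 1.0]  # stand-in for the selected pseudo-label

l_s = dice_loss(student_on_labeled, label)               # supervised loss
l_c = mse_loss(student_on_unlabeled, teacher_on_unlabeled)  # consistency loss
l_u = dice_loss(student_on_unlabeled, pseudo_label)      # pseudo-label loss
l_t = total_loss(l_s, l_c, l_u, lambda_c=0.5, lambda_u=1.0)

# After a (not shown) gradient step on the student, refresh the teacher.
teacher_weights = ema_update([0.2, -0.4], [0.3, -0.1], alpha=0.99)
```

The values of `lambda_c`, `lambda_u`, and `alpha` here are arbitrary placeholders. In a real training loop the same update would be applied to every parameter tensor of the dynamic teacher, while the static teacher's weights stay fixed.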
The CSDT framework gradually improves the quality of pseudo-labels by combining Static Teacher (ST) and Dynamic Teacher (DT) models with the adaptive pseudo-labeling (APL) strategy, thereby enhancing the generalization ability of the model. MSSA-Net significantly improves the response to diverse and faint stripe-like targets through the multi-scale dual-path convolution (MDPC) block and the feature map weighted attention (FMWA) block.

### Formula summary

1. **Supervised loss**:
   \[ L_s = \frac{1}{|B_l|} \sum_{(x_i^l, y_i^l) \in B_l} l_d\left(f(x_i^l; \Theta_S), y_i^l\right) \]
   where \( l_d \) is the Dice loss, \( B_l \) is a batch of labeled images input to the student model, and \( \Theta_S \) denotes the weights of the student model.
2. **Consistency loss**:
   \[ L_c = \frac{1}{|B_u|} \sum_{x_i^u \in B_u} l_m\left(f(x_i^u; \Theta_S), f(x_i^u; \Theta_{DT})\right) \]
   where \( l_m \) is the mean squared error (MSE) loss, \( B_u \) is a batch of unlabeled images input to the student model, and \( \Theta_{DT} \) denotes the weights of the dynamic teacher model.
3. **Pseudo-label supervised loss**:
   \[ L_u = \frac{1}{|B_u|} \sum_{x_i^u \in B_u} l_d\left(f(x_i^u; \Theta_S), \hat{y}_i^{u,p}\right) \]
   where \( \hat{y}_i^{u,p} \) is the optimal pseudo-label generated by the static teacher or dynamic teacher model.
4. **Total loss**:
   \[ L_t = L_s + \lambda_c L_c + \lambda_u L_u \]
   where \( \lambda_c \) is the weight of the consistency loss, which is adjusted dynamically during training, and \( \lambda_u \) is a fixed constant.
5. **EMA update formula**:
   \[ \Theta_t^{DT} = \alpha \Theta_{t-1}^{DT} + (1 - \alpha) \Theta_t^S \]
   where \( \Theta_t^{DT} \) and \( \Theta_t^S \) are the dynamic teacher and student weights (\( \Theta_{DT} \) and \( \Theta_S \) above) at training step \( t \), and \( \alpha \) is the decay rate, usually set between 0.9 and 0.999.

Through these improvements,