Applying the Lower-Biased Teacher Model in Semi-Supervised Object Detection

Shuang Wang
2024-10-04
Abstract: I present the Lower Biased Teacher model, an enhancement of the Unbiased Teacher model, specifically tailored for semi-supervised object detection tasks. The primary innovation of this model is the integration of a localization loss into the teacher model, which significantly improves the accuracy of pseudo-label generation. By addressing key issues such as class imbalance and the precision of bounding boxes, the Lower Biased Teacher model demonstrates superior performance in object detection tasks. Extensive experiments on multiple semi-supervised object detection datasets show that the Lower Biased Teacher model not only reduces the pseudo-labeling bias caused by class imbalances but also mitigates errors arising from incorrect bounding boxes. As a result, the model achieves higher mAP scores and more reliable detection outcomes compared to existing methods. This research underscores the importance of accurate pseudo-label generation and provides a robust framework for future advancements in semi-supervised learning for object detection.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is how to generate more accurate pseudo-labels in semi-supervised object detection and how to reduce the bias caused by class imbalance and incorrect bounding boxes. Specifically:

1. **Class imbalance problem**: In semi-supervised learning, class imbalance degrades the quality of pseudo-labels, which in turn hurts model performance.
2. **Bounding box positioning accuracy problem**: Existing semi-supervised methods often fail to localize bounding boxes precisely when generating pseudo-labels, resulting in inaccurate detections.

To solve these problems, the author proposes an improved teacher model, the **Lower Biased Teacher (LBT) model**. Its main innovation is to introduce a localization loss into the teacher model, which significantly improves the accuracy of pseudo-label generation. In this way, the LBT model handles class imbalance better and reduces the error caused by incorrect bounding boxes.

### Specific improvement measures

- **Introducing localization loss**: A traditional teacher model usually considers only the classification loss. The LBT model adds a localization loss so that the generated pseudo-labels not only carry the correct class but also have more accurate bounding box positions. The localization loss is:

  \[ l_{\text{loc}}(f(I), f(\hat{I}))=\frac{1}{4}\left(\|\Delta c_{x}-(-\Delta c'_{x})\|_{1}+\|\Delta c_{y}-\Delta c'_{y}\|_{1}+\|\Delta \omega-\Delta \omega'\|_{1}+\|\Delta h-\Delta h'\|_{1}\right) \]

  where \(f(I)\) and \(f(\hat{I})\) are the predictions for the original image and the flipped image respectively, and \(\Delta c_{x}, \Delta c_{y}, \Delta \omega, \Delta h\) are the center-position and scale offsets of the candidate boxes, with primed quantities taken from the flipped image. The horizontal offset of the flipped prediction is negated because a horizontal flip mirrors the x-axis.
- **Dynamically adjusting the confidence threshold**: To limit the influence of noisy pseudo-labels, the LBT model adjusts the confidence threshold dynamically. A prediction is used as a pseudo-label only when the model is sufficiently confident in it.
- **Non-maximum suppression (NMS)**: To handle duplicate bounding box predictions, the LBT model performs class-wise non-maximum suppression (NMS) before applying the confidence threshold, removing redundant prediction boxes.
- **Consistency regularization**: The LBT model also introduces consistency regularization to keep the outputs of the student model and the teacher model consistent, thereby improving the generalization ability of the model.

Through these improvements, the LBT model achieves results significantly superior to existing methods on multiple semi-supervised object detection datasets, especially when little labeled data is available.
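
To make the localization loss and the pseudo-label filtering order concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the author's implementation: the function names, the `(box, label, score)` tuple layout, the averaging over boxes, and the default thresholds are all hypothetical. The summary does not specify how the confidence threshold is adjusted over time, so the threshold is simply passed in by the caller.

```python
from typing import List, Sequence, Tuple

Offsets = Tuple[float, float, float, float]   # (dcx, dcy, dw, dh) for one candidate box
Box = Tuple[float, float, float, float]       # (x1, y1, x2, y2)
Detection = Tuple[Box, str, float]            # (box, class label, confidence score)


def flip_localization_loss(orig: Sequence[Offsets],
                           flipped: Sequence[Offsets]) -> float:
    """Evaluate l_loc per box pair and average over boxes (averaging is an assumption).

    The x-offset of the flipped prediction is negated before comparison,
    matching the first term of the formula.
    """
    if not orig or len(orig) != len(flipped):
        raise ValueError("need two non-empty, equally long offset lists")
    total = 0.0
    for (dcx, dcy, dw, dh), (dcx_f, dcy_f, dw_f, dh_f) in zip(orig, flipped):
        total += (abs(dcx - (-dcx_f)) + abs(dcy - dcy_f)
                  + abs(dw - dw_f) + abs(dh - dh_f)) / 4.0
    return total / len(orig)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(dets: Sequence[Detection],
                         score_thresh: float = 0.7,   # hypothetical value
                         iou_thresh: float = 0.5      # hypothetical value
                         ) -> List[Detection]:
    """Class-wise NMS first, then confidence thresholding."""
    kept: List[Detection] = []
    # Greedy NMS: visit detections from highest to lowest score and drop any
    # box that overlaps an already kept box of the same class too strongly.
    for det in sorted(dets, key=lambda d: d[2], reverse=True):
        box, label, _ = det
        if all(k[1] != label or iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append(det)
    # Only sufficiently confident survivors become pseudo-labels.
    return [d for d in kept if d[2] >= score_thresh]
```

In a training loop, `score_thresh` would be the current value of whatever threshold schedule is in use. Calling `flip_localization_loss` with offsets whose x-components have opposite signs and whose other components match gives zero loss, which is the consistency the formula rewards.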