Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm

Guanming Huang, Aoran Shen, Yuxiang Hu, Junliang Du, Jiacheng Hu, Yingbin Liang
2024-10-16
Abstract: This paper explores the application of knowledge distillation to object detection, focusing on how different distillation temperatures affect the performance of the student model. Using YOLOv5l as the teacher network and the smaller YOLOv5s as the student network, we found that the student's detection accuracy gradually improved as the distillation temperature increased, and that at a specific temperature the student achieved mAP50 and mAP50-95 scores better than those of the original YOLOv5s model. Experimental results show that an appropriate knowledge distillation strategy can improve not only the accuracy of the model but also its reliability and stability in practical applications. This paper also records in detail the accuracy curve and the loss curve during training and shows that the model converges to a stable state after 150 training cycles. These findings provide a theoretical basis and technical reference for further optimizing object detection algorithms.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper asks how knowledge distillation can improve the YOLOv5s object detection model, and in particular how the distillation temperature affects the performance of the small (student) model. The larger YOLOv5l model serves as the teacher network and the smaller YOLOv5s model as the student network, and the study examines whether the student's detection accuracy improves under different distillation temperatures. The experimental results show that as the distillation temperature increases, the student's detection accuracy gradually improves and, at a specific temperature, exceeds the mAP50 and mAP50-95 scores of the original YOLOv5s model. The paper also records in detail the accuracy curve and the loss curve during training, showing that the model converges to a stable state after 150 training cycles. These findings provide a theoretical basis and technical reference for optimizing object detection algorithms.

Summary:

- **Problem**: How to optimize the object detection performance of YOLOv5s through knowledge distillation.
- **Method**: Use YOLOv5l as the teacher model and YOLOv5s as the student model to study the impact of different distillation temperatures.
- **Result**: An appropriate knowledge distillation strategy not only improves the accuracy of the model but also enhances its reliability and stability in practical applications.
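The role of the distillation temperature can be illustrated with a minimal sketch. The summary does not state the exact distillation loss, so the code below assumes the widely used temperature-softened softmax with a KL divergence scaled by T². The function names and example logits are hypothetical and are not taken from the paper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (assumed softening scheme)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p_t = softmax_with_temperature(teacher_logits, temperature)
    p_s = softmax_with_temperature(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_t, p_s) if pt > 0)
    return kl * temperature ** 2

# Hypothetical class logits for one prediction (illustration only).
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 0.2]

for T in (1.0, 2.0, 4.0):
    soft_targets = softmax_with_temperature(teacher_logits, T)
    loss = distillation_kl(teacher_logits, student_logits, T)
    print(f"T={T}: teacher soft targets={[round(p, 3) for p in soft_targets]}, "
          f"KD loss={loss:.4f}")
```

A higher temperature flattens the teacher's distribution, so the student also sees the relative scores the teacher gives to non-target classes. This is one plausible reason why accuracy changes with temperature, though the summary does not confirm the mechanism.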
### Loss Function

The loss function of the key distillation area for classification prediction is:

\[ L_{cls}^{md} = \sum (\log(P_s) \cdot y + \log(1 - P_s) \cdot (1 - y)) \]

The cross-entropy loss for classification prediction is:

\[ L_{cls}^{CE} = \sum (\log(P_t) \cdot y + \log(1 - P_t) \cdot (1 - y)) \]

The cross-entropy loss between the classification result of the student network and the true label is:

\[ L_{cls}^{CE} = \sum (\log(P_s) \cdot y + \log(1 - P_s) \cdot (1 - y)) \]

The confidence loss function is:

\[ L_{obj} = \sum ((1 - O_t) \cdot O_s + (1 - O_s) \cdot O_t) \]

### Bounding Box Representation

For a given bounding box \( B \), the probability distribution of its edges is represented as:

\[ Be = \int_{x_{min}}^{x_{max}} Pr(x) dx \]

where the range of regression coordinates is \([x_{min}, x_{max}]\), and \( Pr(x) \) is the corresponding probability. When \( x = gt \), \( Pr(x) = 1 \); otherwise \( Pr(x) = 0 \). The continuous regression range is quantized into uniform discrete variables \([e_1, e_2, ..., e_n]\), and the probability distribution of each edge is represented by the softmax function.
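The formulas above can be sketched in plain Python. Several points are assumptions rather than statements from the paper: P_s and P_t are read as student and teacher class probabilities, O_s and O_t as student and teacher objectness scores, and y as the binary ground-truth label. The sums are written without a leading minus sign, as they appear above, whereas conventional binary cross-entropy negates them, so both forms are shown. The expected-value decoding of the quantized edge is a common choice that the summary does not spell out. All function names are hypothetical.

```python
import math

def log_likelihood_sum(probs, labels, eps=1e-12):
    """Sum of y*log(p) + (1-y)*log(1-p), as written in the formulas above."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def binary_cross_entropy(probs, labels):
    """Conventional BCE: the negated sum, so that lower is better."""
    return -log_likelihood_sum(probs, labels)

def objectness_loss(o_teacher, o_student):
    """Sum of (1 - O_t) * O_s + (1 - O_s) * O_t over all predictions."""
    return sum((1 - ot) * os_ + (1 - os_) * ot
               for ot, os_ in zip(o_teacher, o_student))

def edge_distribution(logits):
    """Softmax over the n discrete bins of one bounding-box edge."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_edge(logits, x_min, x_max):
    """Expected edge position over uniform bins e_1..e_n (assumed decoding)."""
    n = len(logits)
    if n < 2:
        raise ValueError("need at least two bins")
    step = (x_max - x_min) / (n - 1)
    bins = [x_min + i * step for i in range(n)]
    probs = edge_distribution(logits)
    return sum(p * e for p, e in zip(probs, bins))

# Hypothetical values for illustration only.
student_probs = [0.8, 0.3]
labels = [1, 0]
print("BCE (student vs. labels):", binary_cross_entropy(student_probs, labels))
print("Objectness loss:", objectness_loss([0.9, 0.1], [0.7, 0.2]))
print("Expected edge:", expected_edge([0.1, 2.0, 0.3, -1.0], 0.0, 6.0))
```

The objectness term is zero when teacher and student agree on confident scores of exactly 0 or 1, and it grows as their scores move toward opposite extremes.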