Research on Multi-objective Detection Algorithm Based on Improved YOLOv5s Model

Chu Tao, Jiang Liang, Gong Jie, Wang Chengchen, Xu Fuhui, Zhang Dexiang
DOI: https://doi.org/10.23919/ccc63176.2024.10662766
2024-01-01
Abstract: Real urban surveillance scenes are complex and varied: monitored objects change in size, are often partially occluded, and frequently appear with poor contrast. To address this, a multi-objective detection algorithm for full-scene monitoring based on the YOLOv5s model and an attention mechanism is proposed. To improve the network's adaptability to changes in object size, a multiscale detection network structure based on the YOLOv5s model is designed. An attention mechanism is then introduced so that the network focuses on the more representative parts of the input and suppresses less important information. The attention-based feature extraction module suppresses background features and enhances object features, and channel-level feature weights are learned by the network to improve its feature extraction ability. In addition, the K-means clustering algorithm is used to calculate the initial anchor box sizes for the full-scene monitoring data set, which improves both the convergence speed of the model and the accuracy of object detection. Experimental results demonstrate that the multi-objective detection algorithm based on the YOLOv5s model and attention mechanism is effective and efficient in complex urban surveillance scenes. Compared with the baseline network, mAP and mAP50 are greatly improved and the model inference time is reduced. On the full-scene monitoring data set, mAP50 reaches 89.6%. When processing surveillance video, the model runs at 154 frames per second (FPS), which meets the real-time detection requirements of the monitoring site.
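The anchor initialization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact procedure, so everything here is an assumption: the function names `iou_wh` and `kmeans_anchors` are hypothetical, the distance is taken to be 1 - IoU between width-height pairs (a common choice for anchor clustering, not confirmed by the source), and the box sizes are made-up illustrative values rather than data from the paper.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) pairs, treating both boxes as sharing a corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: a box is assigned to the anchor with the highest IoU,
    which is equivalent to the smallest (1 - IoU) distance.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialize from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                # New anchor is the mean width and height of its members.
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                # Keep the old anchor if no box was assigned to it.
                new_anchors.append(anchors[i])
        if new_anchors == anchors:
            break  # assignments are stable, so the clustering has converged
        anchors = new_anchors
    # Sort by area so small anchors come first.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Illustrative (made-up) box sizes in pixels: small, medium, and large objects.
example_boxes = [
    (10, 12), (12, 10), (11, 11),
    (50, 60), (60, 50), (55, 55),
    (200, 220), (220, 200), (210, 210),
]
example_anchors = kmeans_anchors(example_boxes, k=3)
```

Like any K-means variant, this sketch can settle in a local optimum that depends on the random initialization, so in practice one might run it with several seeds and keep the anchor set with the highest mean best-match IoU.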