Real-time object detection method based on YOLOv5 and efficient mobile network
Shuai Feng, Huaming Qian, Huilin Wang, Wenna Wang
DOI: https://doi.org/10.1007/s11554-024-01433-9
IF: 2.293
2024-03-26
Journal of Real-Time Image Processing
Abstract: YOLOv5, a deep-learning-based object detection algorithm, is inefficient because it has a large number of parameters and a complex structure. These drawbacks hinder its deployment on mobile devices, which have limited computing power and storage. To address these limitations, we introduce a lightweight object detection algorithm that combines the coordinate attention (CA) mechanism with the YOLOv5 framework. Our approach embeds the CA mechanism into MobileNetv2 to create MobileNetv2-CA, which replaces CSPDarkNet53 as the backbone network of YOLOv5. This reduces the model's parameter count while maintaining competitive accuracy. To further improve performance, we propose a multi-scale fast spatial pyramid pooling (MSPPF) layer, designed to make the model faster and more accurate across different input image sizes. In addition, we incorporate MPANet, a feature fusion network composed of optimized upsampling and downsampling modules together with feature extraction cells, which is intended to raise detection precision while keeping the parameter overhead low. Experimental results show that our method achieves a mean average precision (mAP) of 87.6% on the PASCAL VOC07+12 dataset and an average precision (AP) of 39.4% on the MS COCO dataset, with a model parameter size of only 10.1 MB. Compared with the original YOLOv5, the proposed model reduces parameters by 76.9% and runs 1.72 times faster, reaching 54.9 frames per second (FPS) on an NVIDIA RTX 3060. Compared with state-of-the-art (SOTA) methods, our method offers a good balance between accuracy and real-time performance.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology
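
The abstract gives its efficiency results relative to the original YOLOv5 but does not state the baseline's own size or speed. The following sketch shows how those figures could be back-computed from the reported numbers. It rests on assumptions the abstract does not confirm: that the 76.9% reduction applies to the same quantity as the 10.1 MB figure, that "1.72 times faster" is a throughput ratio (new FPS divided by baseline FPS), and that both comparisons use the same YOLOv5 variant. The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the efficiency figures quoted in the abstract.
# Assumptions (not confirmed by the abstract):
#   - the 76.9% reduction is measured on the same quantity as the 10.1 MB figure
#   - "1.72 times faster" means new_fps / baseline_fps == 1.72
#   - both comparisons use the same (unspecified) YOLOv5 variant

REPORTED_SIZE_MB = 10.1      # parameter size of the proposed model
REPORTED_REDUCTION = 0.769   # fractional reduction versus the original YOLOv5
REPORTED_FPS = 54.9          # throughput on an NVIDIA RTX 3060
REPORTED_SPEEDUP = 1.72      # throughput ratio versus the original YOLOv5


def implied_baseline_size(size_mb: float, reduction: float) -> float:
    """Baseline size such that size_mb == baseline * (1 - reduction)."""
    return size_mb / (1.0 - reduction)


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Baseline throughput such that fps == baseline * speedup."""
    return fps / speedup


def frame_time_ms(fps: float) -> float:
    """Average time budget per frame, in milliseconds."""
    return 1000.0 / fps


baseline_size_mb = implied_baseline_size(REPORTED_SIZE_MB, REPORTED_REDUCTION)
baseline_fps = implied_baseline_fps(REPORTED_FPS, REPORTED_SPEEDUP)

print(f"implied baseline size: {baseline_size_mb:.1f} MB")
print(f"implied baseline speed: {baseline_fps:.1f} FPS")
print(f"per-frame time, proposed: {frame_time_ms(REPORTED_FPS):.1f} ms")
print(f"per-frame time, baseline: {frame_time_ms(baseline_fps):.1f} ms")
```

Under this reading, the baseline would be a little under 44 MB and run at about 32 FPS. These are derived estimates, not values reported by the authors, and they change if "1.72 times faster" is read differently.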