Lite-YOLOv3: a real-time object detector based on multi-scale slice depthwise convolution and lightweight attention mechanism

Yipeng Zhou, Huaming Qian, Peng Ding
DOI: https://doi.org/10.1007/s11554-023-01379-4
IF: 2.293
2023-11-08
Journal of Real-Time Image Processing
Abstract: Gains in object detector performance often come with deeper networks and heavier computational overhead, yet scenarios with constrained computation and storage demand real-time performance with fewer resources. Existing methods tend to face a difficult trade-off between parameters, computation, speed and accuracy. We propose Lite-YOLOv3, a lightweight real-time object detector obtained by optimizing YOLOv3. Firstly, sparse pruning of the trained model significantly decreases the parameters and computation while boosting speed. Secondly, a channel-wise convolution attention (CWA) mechanism is proposed to enhance the feature extraction capability of the backbone with essentially no extra computational burden. Furthermore, a multi-scale slice depthwise convolution with efficient channel attention (MSD-ECA) is proposed to enlarge the receptive field and improve cross-scale information representation. Finally, SIoU is chosen as the localization loss function to improve training speed and regression accuracy. For 512 × 512 input, Lite-YOLOv3 achieves 74.1 mAP at 113 FPS on the PASCAL VOC07+12 dataset and 52.4 mAP on the MS COCO2017 dataset. Experimental results show that, compared with YOLOv3, Lite-YOLOv3 is slightly less accurate but has only 24.8% of the parameters and 30.7% of the computation, and its inference speed is 1.7 times faster, which demonstrates the effectiveness of the proposed method and makes it comparable with other models.
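To give a rough sense of why depthwise convolution keeps a module light, the sketch below counts weights for a standard convolution and a generic depthwise convolution. It is not the paper's MSD-ECA module: the function names are hypothetical, biases are ignored, and the multi-scale variant (equal channel slices, one kernel size per slice) is only an assumed reading of "multi-scale slice".

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_conv_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution: one k x k filter per channel."""
    return k * k * channels


def multi_scale_depthwise_params(channels: int, kernel_sizes: tuple) -> int:
    """Assumed layout: channels split into equal slices, and each slice gets
    a depthwise convolution with its own kernel size. Hypothetical, not taken
    from the paper."""
    if channels % len(kernel_sizes) != 0:
        raise ValueError("channels must divide evenly across the slices")
    slice_width = channels // len(kernel_sizes)
    return sum(depthwise_conv_params(slice_width, k) for k in kernel_sizes)


if __name__ == "__main__":
    c = 256
    print("standard 3x3:        ", standard_conv_params(c, c, 3))
    print("depthwise 3x3:       ", depthwise_conv_params(c, 3))
    print("sliced (3, 5, 7, 9): ", multi_scale_depthwise_params(c, (3, 5, 7, 9)))
```

Under these assumptions, even the slices with larger kernels cost far fewer weights than a single standard 3 × 3 convolution over the same channels, which is the general reason such a module can widen the receptive field cheaply.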
computer science, artificial intelligence, engineering, electrical & electronic, imaging science & photographic technology
What problem does this paper attempt to address?