YOLOv5s-BC: An improved YOLOv5s-based method for real-time apple detection

Jingfan Liu, Zhaobing Liu
2023-11-10
Abstract: To address the shortcomings of existing apple detection algorithms, this study proposes an improved YOLOv5s-based method, named YOLOv5s-BC, for real-time apple detection, which introduces a series of modifications. Firstly, a coordinate attention (CA) block is incorporated into the backbone module to construct a new backbone network. Secondly, the original concatenation operation in the neck module is replaced with a bidirectional feature pyramid network (BiFPN). Lastly, a new detection head is added to the head module, enabling the detection of smaller and more distant targets within the field of view of the robot. The proposed YOLOv5s-BC model was compared to several target detection algorithms, including YOLOv5s, YOLOv4, YOLOv3, SSD, Faster R-CNN (ResNet50), and Faster R-CNN (VGG), with significant improvements of 4.6%, 3.6%, 20.48%, 23.22%, 15.27%, and 15.59% in mAP, respectively. The detection accuracy of the proposed model is also greatly enhanced over the original YOLOv5s model. The model has an average detection speed of 0.018 seconds per image, and its weight size is only 16.7 MB, which is 4.7 MB smaller than that of YOLOv8s, meeting the real-time requirements of the picking robot. Furthermore, according to the heat map, the proposed model focuses more on and learns the high-level features of the target apples, and recognizes smaller target apples better than the original YOLOv5s model. In tests in other apple orchards, the model detects pickable apples correctly and in real time, illustrating a decent generalization ability.
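To put the speed and size figures in more familiar terms, the short sketch below converts the reported per-image time into a throughput and derives the YOLOv8s weight size implied by the stated difference. The helper name `frames_per_second` is made up for illustration, and the abstract does not say whether the 0.018 s figure includes pre- and post-processing, so the throughput is only a rough upper-level reading of that number.

```python
def frames_per_second(seconds_per_image: float) -> float:
    """Convert an average per-image latency into images per second."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return 1.0 / seconds_per_image


# Figures quoted in the abstract.
detection_time_s = 0.018      # average detection time per image
model_size_mb = 16.7          # YOLOv5s-BC weight size
size_gap_mb = 4.7             # stated difference to YOLOv8s

fps = frames_per_second(detection_time_s)
# Implied YOLOv8s weight size (derived here, not quoted in the abstract).
yolov8s_size_mb = model_size_mb + size_gap_mb

print(f"Throughput: {fps:.1f} images/s")                     # 55.6 images/s
print(f"Implied YOLOv8s weights: {yolov8s_size_mb:.1f} MB")  # 21.4 MB
```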
Image and Video Processing
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

Existing apple detection algorithms cannot accurately distinguish occluded apples from pickable apples, which lowers harvesting accuracy and leads to high rates of false picks and missed picks. To tackle these problems, the study proposes an improved method based on YOLOv5s, named YOLOv5s-BC, for real-time apple detection.

### Specific Improvements

1. **Introduction of Coordinate Attention (CA) Module**:
   - The CA module is added to the backbone to construct a new backbone network, enhancing the model's ability to focus on key parts of the image.
   - The CA mechanism is also introduced in the neck module to improve the feature pyramid's discrimination ability at different scales, enhancing the detection of small targets.
2. **Replacement of Original Concatenation Operations**:
   - The original concatenation operations in the neck module are replaced with a Bidirectional Feature Pyramid Network (BiFPN), which builds a more stable and accurate feature pyramid and fuses features with adaptive weights.
3. **Addition of New Detection Head**:
   - A new detection head is added to the head module. It uses high-resolution feature maps to detect smaller, more distant targets, improving the accuracy of detection and localization.

### Experimental Results

- **Performance Comparison**:
  - Compared to YOLOv5s, YOLOv4, YOLOv3, SSD, Faster R-CNN (ResNet50), and Faster R-CNN (VGG), the YOLOv5s-BC model improved mAP by 4.6%, 3.6%, 20.48%, 23.22%, 15.27%, and 15.59%, respectively.
  - The model's average detection speed is 0.018 seconds per image, and the model size is only 16.7 MB, which is 4.7 MB smaller than YOLOv8s, meeting the real-time detection requirements of picking robots.
- **Heatmap Analysis**:
  - According to the heatmap, the proposed model focuses on and learns the high-level features of target apples better than the original YOLOv5s, improving the recognition of smaller target apples.
- **Orchard Testing**:
  - In tests conducted in other apple orchards, the model detected pickable apples accurately and in real time, demonstrating good generalization ability.

### Conclusion

The YOLOv5s-BC model proposed in this study improves the accuracy of apple detection while maintaining a high detection speed, providing technical support for apple harvesting robots, especially in real-time target detection and harvest sequence planning.
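The adaptive weighting mentioned for the BiFPN replacement can be illustrated with the "fast normalized fusion" rule from the general BiFPN design: each input feature gets a learnable scalar weight, the weights are clipped to be non-negative, and they are normalized to sum to roughly one. The sketch below is a minimal, framework-free illustration of that rule, not the authors' implementation. The function name, the toy feature vectors, and the weight values are all assumptions, and the summary does not confirm that YOLOv5s-BC uses exactly this form.

```python
from typing import List, Sequence, Tuple


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> Tuple[List[float], List[float]]:
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Returns the fused vector and the normalized weights that produced it.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # ReLU-style clipping keeps every (learnable) weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when all weights are clipped to zero.
    total = sum(clipped) + eps
    normalized = [w / total for w in clipped]

    # Weighted element-wise sum instead of plain concatenation.
    fused = [
        sum(n * f[i] for n, f in zip(normalized, features))
        for i in range(length)
    ]
    return fused, normalized


# Toy example: two feature vectors standing in for feature maps
# (hypothetical values, chosen only for illustration).
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused, normalized = fast_normalized_fusion([top_down, lateral], [2.0, 1.0])
print("normalized weights:", normalized)
print("fused feature:", fused)
```

Unlike concatenation, which keeps every input channel and leaves the weighting to the next convolution, this fusion keeps the channel count fixed and lets the network learn how much each scale should contribute.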