MSOAR-YOLOv10: Multi-Scale Occluded Apple Detection for Enhanced Harvest Robotics

Heng Fu, Zhengwei Guo, Qingchun Feng, Feng Xie, Yijing Zuo, Tao Li
DOI: https://doi.org/10.3390/horticulturae10121246
2024-01-01
Horticulturae
Abstract: The accuracy of apple fruit recognition in orchard environments is significantly affected by occlusion and lighting variations, which lead to missed and false detections. To address these challenges, particularly for occluded apples, this study proposes an improved apple-detection model, MSOAR-YOLOv10, based on YOLOv10. First, the multi-scale feature fusion network is enhanced by adding a 160 × 160 feature scale layer to the backbone network, which increases the model’s sensitivity to small local features, particularly for occluded fruits. Second, the Squeeze-and-Excitation (SE) attention mechanism is integrated into the C2fCIB convolution module of the backbone network to improve the network’s focus on the regions of interest in the input images. Third, a Diverse Branch Block (DBB) module is introduced to enhance the performance of the convolutional neural network. Finally, a Normalized Wasserstein Distance (NWD) loss function is proposed to reduce missed detections of densely packed and overlapping targets. Experimental results in orchards indicate that the improved YOLOv10 model achieves precision, recall, and mean average precision of 89.3%, 89.8%, and 92.8%, respectively, representing increases of 3.1%, 2.2%, and 3.0% over the original YOLOv10 model. These results indicate that the proposed network significantly enhances apple recognition accuracy in complex orchard environments and improves the operational precision of harvesting robots in real-world conditions.
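To make the NWD idea mentioned in the abstract concrete, the following is a minimal sketch of the commonly used closed-form Normalized Wasserstein Distance between two axis-aligned boxes, each modelled as a 2D Gaussian. It is not the authors' implementation: the function names `nwd` and `nwd_loss`, the `(cx, cy, w, h)` box layout, and the normalization constant `c = 12.8` are assumptions chosen for illustration, and the abstract does not say how the term is weighted within the full YOLOv10 loss.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Similarity in (0, 1] between two boxes given as (cx, cy, w, h).

    Each box is modelled as a 2D Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). The squared 2-Wasserstein
    distance between two such Gaussians has the closed form below.
    `c` is a normalization constant (hypothetical value here; in
    practice it would be tuned to the object sizes in the dataset).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss term that is 0 for identical boxes and approaches 1 as they diverge."""
    return 1.0 - nwd(pred_box, target_box, c)


if __name__ == "__main__":
    # Two small boxes that do not overlap at all: IoU would be exactly 0,
    # but NWD still yields a smooth, non-zero similarity.
    a = (10.0, 10.0, 4.0, 4.0)
    b = (20.0, 10.0, 4.0, 4.0)
    print(nwd(a, b), nwd_loss(a, b))
```

Unlike IoU, this similarity changes smoothly with the distance between box centres even when the boxes do not overlap, which is one plausible reason such a measure helps with small, densely packed, or partially occluded targets.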