Greenhouse Tomato Detection and Pose Classification Algorithm Based on Improved YOLOv5

Junxiong Zhang, Jinyi Xie, Fan Zhang, Jin Gao, Chen Yang, Chaoyu Song, Weijie Rao, Yu Zhang
DOI: https://doi.org/10.1016/j.compag.2023.108519
IF: 8.3
2023-01-01
Computers and Electronics in Agriculture
Abstract: In tomato production, picking is a key step of the harvesting operation. Manual picking is time-consuming and labor-intensive, so automated robotic picking is the main development trend. One of the main reasons for the low success rate of current tomato-picking robots is collision between the end-effector and the tomato plant during picking. This paper therefore proposes a cascaded deep learning algorithm that reduces collisions by identifying tomato maturity, estimating 3D poses, and searching for a collision-free picking strategy. The algorithm consists of three steps: tomato bunch detection; tomato detection and occlusion judgment; and classification of maturity and pose, with each task based on a trained YOLOv5s network. Drawing on actual harvesting practice, an economical 3D pose estimation method is proposed in which the pose estimation task is divided into two classification tasks that together yield four typical 3D poses. Experimental results show that the YOLOv5-based visual detection and pose classification algorithm, which takes RGB images as input, can detect unoccluded tomatoes and classify their maturity and 3D pose at a detection speed of 20 fps. For unoccluded tomatoes, the detection accuracy is 82.4% and the recall is 90.9%; the average accuracy of maturity classification is 96.9%, and that of 3D pose classification is 89.1%.
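The three-step cascade described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names (`run_cascade`, `combine_pose`), the `TomatoResult` type, and the stub detectors are all hypothetical, and the sketch assumes that each of the two pose classification tasks is binary, so that their combination gives the four pose classes. In a real system the three callables would wrap trained YOLOv5s models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (x1, y1, x2, y2) in pixel coordinates; an illustrative convention.
Box = Tuple[int, int, int, int]


@dataclass
class TomatoResult:
    box: Box
    maturity: str
    pose_id: int  # 0..3, one of four pose classes


def combine_pose(label_a: int, label_b: int) -> int:
    """Merge two binary pose labels into a single class index 0..3.

    Assumption: each pose sub-task has exactly two classes. The abstract
    only states that two classification tasks yield four typical poses.
    """
    if label_a not in (0, 1) or label_b not in (0, 1):
        raise ValueError("each pose label must be 0 or 1")
    return 2 * label_a + label_b


def run_cascade(
    image,
    detect_bunches: Callable,   # step 1: image -> list of bunch boxes
    detect_tomatoes: Callable,  # step 2: (image, bunch box) -> [(box, occluded)]
    classify: Callable,         # step 3: (image, box) -> (maturity, a, b)
) -> List[TomatoResult]:
    results = []
    for bunch_box in detect_bunches(image):
        for tomato_box, occluded in detect_tomatoes(image, bunch_box):
            if occluded:
                continue  # only unoccluded tomatoes go on to step 3
            maturity, label_a, label_b = classify(image, tomato_box)
            results.append(
                TomatoResult(tomato_box, maturity, combine_pose(label_a, label_b))
            )
    return results


# Stub stages standing in for trained networks (hypothetical outputs).
def fake_detect_bunches(image):
    return [(0, 0, 100, 100)]


def fake_detect_tomatoes(image, bunch_box):
    return [((10, 10, 40, 40), False), ((50, 50, 90, 90), True)]


def fake_classify(image, tomato_box):
    return ("ripe", 1, 0)


results = run_cascade(None, fake_detect_bunches, fake_detect_tomatoes, fake_classify)
print(results)
```

With these stubs, the occluded tomato is dropped after step 2 and only one result reaches the classification stage. Passing the stages in as callables keeps the cascade logic independent of any particular detection library.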