Abstract: Introduction: In facility agriculture, the accurate identification of tomatoes at multiple stages has become a significant area of research. However, accurately identifying and localizing tomatoes in complex environments remains a formidable challenge: complex working conditions can impair the performance of conventional detection techniques, underscoring the need for more robust methods.
Methods: To address this issue, we propose a novel model, YOLOv8-EA, for the localization and identification of tomato fruit. The model incorporates four enhancements. First, the EfficientViT network replaces the original YOLOv8 backbone, reducing the number of model parameters and improving the network's ability to extract features. Second, some of the convolutions were integrated into the C2f module to create the C2f-Faster module, which facilitates model inference. Third, the bounding box loss function was changed to SIoU, accelerating model convergence and enhancing detection accuracy. Finally, an Auxiliary Detection Head (Aux-Head) module was incorporated to augment the network's learning capacity.
Results: On the self-constructed dataset, the accuracy, recall, and average precision of YOLOv8-EA were 91.4%, 88.7%, and 93.9%, respectively, with a detection speed of 163.33 frames/s. Compared with the baseline YOLOv8n network, the model weight increased by 2.07 MB; accuracy, recall, and average precision improved by 10.9, 11.7, and 7.2 percentage points, respectively; and detection speed increased by 42.1%. The detection precision for unripe, semi-ripe, and ripe tomatoes was 97.1%, 91%, and 93.7%, respectively. On the public dataset, the accuracy, recall, and average precision of YOLOv8-EA were 91%, 89.2%, and 95.1%, respectively, which are 4, 4.21, and 3.9 percentage points higher than those of the baseline YOLOv8n network. The detection time was 1.8 ms, an 18.2% improvement in detection speed over the baseline, which demonstrates good generalization ability.
Discussion: YOLOv8-EA reliably identifies and localizes multi-stage tomato fruits in complex environments, providing a technical foundation for the development of intelligent tomato picking devices.
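The abstract names SIoU as the bounding box loss but does not spell it out. The following is a minimal sketch of the loss for a single pair of axis-aligned boxes, following the commonly published SIoU formulation (IoU term plus angle-aware distance cost and shape cost). It is not taken from the YOLOv8-EA implementation: the function name `siou_loss`, the `(x1, y1, x2, y2)` box format, the shape exponent `theta=4.0`, and the `eps` stabilizer are all assumptions made for illustration.

```python
import math


def siou_loss(box_pred, box_gt, theta=4.0, eps=1e-7):
    """Illustrative SIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: names and defaults are assumptions, not the
    authors' code.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Offset between the two box centers.
    dx = (gx1 + gx2) / 2 - (px1 + px2) / 2
    dy = (gy1 + gy2) / 2 - (py1 + py2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sin_alpha = min(1.0, abs(dy) / sigma)
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost, weighted by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes relative width and height mismatch.
    omega_w = abs(pw - gw) / (max(pw, gw) + eps)
    omega_h = abs(ph - gh) / (max(ph, gh) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

Calling `siou_loss((0, 0, 10, 10), (0, 0, 10, 10))` returns a value close to zero, and the loss grows as the predicted box drifts away from or changes shape relative to the ground truth. A training pipeline would typically use a batched tensor version of the same terms; this scalar form is only meant to make the individual costs easy to read.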
