A Lightweight Model Based on YOLOv8n in Wheat Spike Detection
Xuyang Ban, Pan Liu, Lei Xu, Jinling Zhao
DOI: https://doi.org/10.1109/agro-geoinformatics59224.2023.10233526
2023-01-01
Abstract: Wheat is a major cereal crop in China, and its yield and quality play an important role in ensuring national food security. Wheat ears are key to predicting and evaluating wheat yield, so fast and accurate wheat ear detection and counting on mobile devices is of great importance for modern, intelligent, large-scale agricultural production. Wheat ear detection faces several problems: ears are densely distributed and overlap and occlude one another, leading to false and missed detections; ears are hard to distinguish from complex weed backgrounds; and their appearance varies with growth stage, color and variety. Existing object detection algorithms suffer from large models, high computing requirements and long computation times, which makes them unsuitable for deployment on portable devices for real-time computation in the field. To achieve real-time detection and counting in the field, it is crucial to maintain detection accuracy while reducing the number of parameters and the computation time. In this paper, the Global Wheat Head Detection (GWHD) dataset is used. After comparing YOLOv5, YOLOv7 and YOLOv8, YOLOv8, which offers the best overall trade-off between model size and accuracy, is chosen as the baseline, and its n variant, which has a smaller model depth and fewer convolution channels, is selected. Because the wheat ears in the dataset are dense and small, feature information is lost after repeated downsampling, and the lower-resolution P5 detection head has difficulty detecting small targets. Therefore, the P5 feature layer is removed, and the P3 and P4 feature layers, which undergo less downsampling and retain more target information, are used for feature extraction. This focuses the model on small wheat ear targets and also reduces the model size.
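The effect of downsampling on small targets can be illustrated with a little arithmetic. The sketch below is not from the paper: it assumes the usual YOLO strides of 8, 16 and 32 for the P3, P4 and P5 levels, and the 640-pixel input and 24-pixel ear width are hypothetical values chosen only for illustration.

```python
# Assumed strides of the standard YOLO detection levels (not stated in the abstract).
STRIDES = {"P3": 8, "P4": 16, "P5": 32}


def grid_size(image_size: int, stride: int) -> int:
    """Side length of the feature map for a square input image."""
    return image_size // stride


def cells_covered(object_px: float, stride: int) -> float:
    """Approximate width of an object measured in feature-map cells."""
    return object_px / stride


def describe(image_size: int = 640, object_px: float = 24.0) -> dict:
    """Map each level to (grid side length, object width in cells).

    image_size and object_px are hypothetical illustration values.
    """
    return {
        name: (grid_size(image_size, stride), cells_covered(object_px, stride))
        for name, stride in STRIDES.items()
    }


if __name__ == "__main__":
    for level, (grid, cells) in describe().items():
        print(f"{level}: {grid}x{grid} grid, object spans {cells:.2f} cells")
```

Under these assumptions, an object narrower than the P5 stride occupies less than one cell at that level, which is consistent with the motivation given for dropping the P5 head.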
Replacing the Conv layers in the network with a lightweight convolutional layer, Depthwise Conv, further reduces the model size and the number of parameters and speeds up detection. The improved method achieves a mean average precision (mAP@0.5) of 94.3%, with a model size of 3.19 MB, a computational cost of 6.2 GFLOPs and a speed of 300 FPS. The improved method proposed in this paper reduces the model size, decreases the number of parameters and speeds up detection without degrading accuracy. In the future, the model can be deployed to mobile devices for real-time wheat spike detection in the field, supporting wheat yield prediction for a given area.
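To see why a depthwise layer is lighter than a standard convolution, the weight counts can be compared directly. This is a minimal sketch, not the authors' implementation: it assumes bias-free convolutions and assumes the depthwise layer uses groups equal to gcd(c_in, c_out); the channel sizes 64 and 128 are hypothetical.

```python
import math


def conv_weights(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a bias-free k x k convolution with the given groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both c_in and c_out")
    # Each output channel only sees c_in / groups input channels.
    return k * k * (c_in // groups) * c_out


def depthwise_weights(c_in: int, c_out: int, k: int) -> int:
    """Weight count when groups = gcd(c_in, c_out) (an assumed grouping rule)."""
    return conv_weights(c_in, c_out, k, groups=math.gcd(c_in, c_out))


if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3  # hypothetical layer shape
    standard = conv_weights(c_in, c_out, k)
    depthwise = depthwise_weights(c_in, c_out, k)
    print(f"standard:  {standard} weights")
    print(f"depthwise: {depthwise} weights")
    print(f"reduction: {standard // depthwise}x")
```

For a single layer the reduction factor equals the number of groups, so the saving grows with channel width. The overall model saving depends on which layers are replaced, which the abstract does not specify.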