Fusion of the YOLOv4 Network Model and Visual Attention Mechanism to Detect Low-Quality Young Apples in a Complex Environment

Jiang Mei, Song Lei, Wang Yunfei, Li Zhenyu, Song Huaibo
DOI: https://doi.org/10.1007/s11119-021-09849-0
IF: 5.767
2022-01-01
Precision Agriculture
Abstract: Accurate detection of young fruits in complex scenes is of great significance for automatic fruit growth monitoring systems. Images captured in open orchards are affected by interference factors such as strong illumination, blur and occlusion, so their quality is low. To improve the detection accuracy of young apples in low-quality images, a novel young apple detection algorithm that fuses the YOLOv4 network model with a visual attention mechanism was proposed. A non-local attention module (NLAM) and convolutional block attention modules (CBAMs) were added to the baseline YOLOv4 model, and the resulting model was named YOLOv4–NLAM–CBAM. The NLAM was used to extract long-range dependency information from high-level visual features; the CBAMs were used to further enhance the model's perception of the region of interest (ROI). To verify the effectiveness of the proposed algorithm, 3,000 young apple images were used for training and testing. The results showed that the detection precision, recall rate, average precision and F1 score of the YOLOv4–NLAM–CBAM model were 85.8%, 97.3%, 97.2% and 91.2%, respectively, and the average run time was 35.1 ms. For highlight/shadow, blur, severe occlusion and other images in the test set, the average precision of the proposed algorithm was 98.0%, 96.2%, 97.0% and 96.9%, respectively. The experimental results showed that the method can detect young apples in low-quality images both accurately and efficiently, and it may serve as a reference for research on the automatic monitoring of young fruit growth.
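The reported F1 score can be cross-checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state the formula the authors used, and the function name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1].

    Assumption: this is the standard F1 definition, not a formula quoted
    from the paper.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
precision = 0.858
recall = 0.973

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 91.2%"
```

Under that assumption, the computed value agrees with the 91.2% F1 score given in the abstract.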