YOLO-GP: A Multi-Scale Dangerous Behavior Detection Model Based on YOLOv8

Bushi Liu, Cuiying Yu, Bolun Chen, Yue Zhao
DOI: https://doi.org/10.3390/sym16060730
2024-06-13
Symmetry
Abstract: In recent years, frequent chemical production safety incidents in China have been primarily attributed to dangerous behaviors by workers. Current monitoring methods predominantly rely on manual supervision, which is not only inefficient but also prone to errors in complex environments and with varying target scales, leading to missed or incorrect detections. To address this issue, we propose a deep learning-based object detection model, YOLO-GP. First, we utilize a grouped pointwise convolutional (GPConv) module of symmetric structure to facilitate information exchange and feature fusion in the channel dimension, thereby extracting more accurate feature representations. Building upon the YOLOv8n model, we integrate the symmetric structure convolutional GPConv module and design the dual-branch aggregation module (DAM) and Efficient Spatial Pyramid Pooling (ESPP) module to enhance the richness of gradient flow information and the capture of multi-scale features, respectively. Finally, we develop a channel feature enhancement network (CFE-Net) to strengthen inter-channel interactions, improving the model's performance in complex scenarios. Experimental results demonstrate that YOLO-GP achieves a 1.56% and 11.46% improvement in the mAP@.5:.95 metric on a custom dangerous behavior dataset and a public Construction Site Safety Image Dataset, respectively, compared to the baseline model. This highlights its superiority in dangerous behavior object detection tasks. Furthermore, the enhancement in model performance provides an effective solution for improving accuracy and robustness, promising significant practical applications.
multidisciplinary sciences
What problem does this paper attempt to address?
The paper addresses the frequent safety accidents in chemical production that are caused by workers' dangerous behaviors, and proposes YOLO-GP, a multi-scale dangerous behavior detection model based on YOLOv8. It points out that current monitoring relies mostly on manual supervision, which is inefficient and prone to missed or incorrect detections in complex environments and across varying target scales. To solve these problems, the research team designed YOLO-GP with the following characteristics:
1. Enhanced feature fusion and information exchange: Grouped pointwise convolution (GPConv) modules with a symmetric structure improve information exchange and feature fusion in the channel dimension, yielding more accurate feature representations.
2. Dual-branch aggregation module (DAM): It replaces the C2f module of the original model to obtain richer gradient flow information, addressing the baseline's poor localization accuracy for dangerous behavior targets and for small targets such as smoking or phone use.
3. Multi-scale feature capture: The Efficient Spatial Pyramid Pooling (ESPP) module improves the model's ability to recognize dangerous behaviors at different scales, so that it can distinguish targets of different sizes.
4. Channel feature enhancement network (CFE-Net): To address insufficient inter-channel correlation in complex scenes, CFE-Net strengthens the interactions between channels, improving detection accuracy in complex scenes.
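To make the grouped pointwise convolution idea concrete, the following is a minimal sketch in plain Python of a generic grouped 1x1 convolution applied to the channel vector of a single pixel, plus a weight count. It is not the authors' GPConv module, whose internal layout is not given here; the function names `grouped_pointwise_conv` and `pointwise_param_count`, the weight layout, and the toy numbers are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def grouped_pointwise_conv(pixel: Vector, weights: List[List[Vector]]) -> Vector:
    """Apply a grouped 1x1 convolution to the channel vector of one pixel.

    Assumed layout (hypothetical): weights[g][o][i] is the weight from input
    channel i of group g to output channel o of group g. Channels in one
    group never mix with channels in another group.
    """
    groups = len(weights)
    if groups == 0 or len(pixel) % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    in_per_group = len(pixel) // groups

    out: Vector = []
    for g, group_weights in enumerate(weights):
        # Slice out the input channels that belong to this group.
        chunk = pixel[g * in_per_group:(g + 1) * in_per_group]
        for row in group_weights:
            if len(row) != in_per_group:
                raise ValueError("weight row length must match channels per group")
            # A 1x1 convolution at one pixel is a dot product over channels.
            out.append(sum(w * x for w, x in zip(row, chunk)))
    return out


def pointwise_param_count(c_in: int, c_out: int, groups: int = 1) -> int:
    """Number of weights (no bias) in a 1x1 convolution with the given groups."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must be divisible by groups")
    return (c_in // groups) * (c_out // groups) * groups


if __name__ == "__main__":
    # Toy example: 4 input channels, 2 groups, 1 output channel per group.
    pixel = [1.0, 2.0, 3.0, 4.0]
    weights = [[[1.0, 1.0]], [[1.0, -1.0]]]
    print(grouped_pointwise_conv(pixel, weights))  # [3.0, -1.0]
    print(pointwise_param_count(64, 64, groups=1))  # 4096
    print(pointwise_param_count(64, 64, groups=4))  # 1024
```

Because each group only sees its own slice of channels, grouping divides the weight count by the number of groups. That isolation is why grouped designs generally need some additional step to exchange information across groups, which is the role the summary attributes to GPConv's channel-dimension feature fusion.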
Experimental results show that, compared with the baseline model, YOLO-GP improves the mAP@.5:.95 metric by 1.56% on a custom dangerous behavior dataset and by 11.46% on a publicly available construction site safety image dataset, highlighting its advantage in dangerous behavior object detection tasks. This performance gain offers an effective way to improve detection accuracy and robustness, and has practical application value.
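For readers unfamiliar with the metric, mAP@.5:.95 conventionally averages average precision over ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU computation and the threshold sweep for a single predicted box. It is not the paper's evaluation code, and the helper names `iou` and `thresholds_passed` are hypothetical.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]

# The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by mAP@.5:.95.
IOU_THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which this prediction would count as a match."""
    overlap = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    truth = (0.0, 0.0, 10.0, 10.0)
    pred = (0.0, 0.0, 10.0, 7.8)  # IoU = 0.78
    print(thresholds_passed(pred, truth))
```

A loosely localized box still counts at the 0.50 threshold but fails the stricter ones, so mAP@.5:.95 rewards tight localization more than mAP@.5 does. This matters for small targets such as smoking or phone use, where a few pixels of offset change the IoU substantially.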