RFA-YOLO-POSE: A Fusion Algorithm for Pose Detection and Object Identification Amidst Complex Crowds

Wenqi Xue, Yuanjian Zhang
DOI: https://doi.org/10.1109/AINIT61980.2024.10581583
2024-03-29
Abstract: This research introduces an algorithm that merges pose detection with object detection, using YOLO-Pose as its foundation. We replace the backbone network so that the model captures more detailed features, and we integrate the CBAM attention mechanism to refine its perception of input feature maps along both the channel and spatial dimensions, improving performance in complex crowd environments. Multi-scale feature fusion lets the model base its judgments on feature maps at different scales. During training, we use the SIoU loss function in place of the standard IoU loss function, which yields more precise bounding box adjustments. For our experiments on the Pedestrian Attribute Recognition dataset, we excluded images with few pedestrians and added images of dense crowds. The results show that our method reaches 90.14% accuracy and 87.55% recall on this dataset, outperforming the original YOLO-Pose by 21.56% in detection accuracy and confirming its effectiveness in dense crowd scenarios. Because the model handles several detection tasks, it is well suited to safety surveillance, public security, crowd analysis, and the analysis of social events.
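For context on the loss function swap mentioned in the abstract, the following is a minimal sketch of the standard IoU loss that SIoU replaces. It is not the authors' code: the function names `iou` and `iou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration. SIoU builds on this quantity by adding angle, distance, and shape penalty terms, which are not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed (for this sketch) to be (x1, y1, x2, y2) tuples
    with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes give an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Standard IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred_box, target_box)
```

With this loss, two boxes that do not overlap always score 1.0 regardless of how far apart they are, so the loss carries no signal about which direction to move the prediction. Penalty terms of the kind SIoU adds are aimed at that gap.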
Computer Science