Detection of Micromobility Vehicles in Urban Traffic Videos

Khalil Sabri, Célia Djilali, Guillaume-Alexandre Bilodeau, Nicolas Saunier, Wassim Bouachir
2024-10-30
Abstract: Urban traffic environments present unique challenges for object detection, particularly with the increasing presence of micromobility vehicles like e-scooters and bikes. To address this object detection problem, this work introduces an adapted detection model that combines the accuracy and speed of single-frame object detection with the richer features offered by video object detection frameworks. This is done by applying aggregated feature maps from consecutive frames processed through motion flow to the YOLOX architecture. This fusion brings a temporal perspective to YOLOX detection abilities, allowing for a better understanding of urban mobility patterns and substantially improving detection reliability. Tested on a custom dataset curated for urban micromobility scenarios, our model showcases substantial improvement over existing state-of-the-art methods, demonstrating the need to consider spatio-temporal information for detecting such small and thin objects. Our approach enhances detection in challenging conditions such as occlusion, ensures temporal consistency, and effectively mitigates motion blur.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the challenge of detecting micromobility vehicles (such as e-scooters and bicycles) in urban traffic videos. Specifically, it focuses on the following aspects:

1. **Detection Accuracy**: Existing single-frame object detection methods suffer from inconsistent detections, motion blur, and occlusion when dealing with micromobility vehicles in urban traffic, which lowers detection accuracy.
2. **Temporal Consistency**: Single-frame detection methods treat each video frame independently and so cannot maintain temporal continuity, which is particularly important in real-time monitoring and autonomous driving.
3. **Multi-class Detection**: Most existing research focuses on detecting a single type of vehicle, while this paper aims to develop a system capable of detecting multiple types of micromobility vehicles simultaneously.

To address these challenges, the paper proposes a detection model called FGFA-YOLOX, which combines the speed and accuracy of single-frame detection with the temporal context of video object detection, improving detection performance by aggregating feature maps from consecutive frames.
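
The idea of aggregating feature maps from consecutive frames can be illustrated with a minimal sketch. It assumes the usual flow-guided feature aggregation recipe: warp each neighbouring frame's feature map onto the reference frame using motion flow, weight each warped map by its per-location cosine similarity to the reference features, normalise the weights with a softmax across frames, and sum. The function names `warp` and `aggregate`, the nested-list layout `[C][H][W]`, nearest-neighbour sampling, and the precomputed flow fields are illustrative assumptions, not the authors' implementation; the actual flow network, similarity embedding, and integration point in YOLOX are not described here.

```python
import math


def warp(feat, flow):
    """Nearest-neighbour backward warp (a simplification of bilinear warping).

    feat: feature map as nested lists [C][H][W].
    flow: [H][W] of (dx, dy) offsets from the reference frame to this frame.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for y in range(H):
        for x in range(W):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the feature-map borders.
            sx = min(max(int(round(x + dx)), 0), W - 1)
            sy = min(max(int(round(y + dy)), 0), H - 1)
            for c in range(C):
                out[c][y][x] = feat[c][sy][sx]
    return out


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two channel vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb + eps)


def aggregate(ref_feat, neighbor_feats, flows):
    """Fuse the reference feature map with flow-warped neighbour maps.

    Weights are a per-location softmax over cosine similarities between
    the reference features and each (warped) frame's features.
    """
    C, H, W = len(ref_feat), len(ref_feat[0]), len(ref_feat[0][0])
    # The reference frame takes part in the aggregation unwarped.
    warped = [ref_feat] + [warp(f, fl) for f, fl in zip(neighbor_feats, flows)]
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for y in range(H):
        for x in range(W):
            ref_vec = [ref_feat[c][y][x] for c in range(C)]
            sims = [cosine(ref_vec, [w[c][y][x] for c in range(C)])
                    for w in warped]
            # Numerically stable softmax across frames.
            m = max(sims)
            exps = [math.exp(s - m) for s in sims]
            total = sum(exps)
            for c in range(C):
                out[c][y][x] = sum(e / total * w[c][y][x]
                                   for e, w in zip(exps, warped))
    return out


if __name__ == "__main__":
    # One channel, one row, two columns; the neighbour is the reference
    # with its two columns swapped, and the flow undoes that swap.
    ref = [[[1.0, 2.0]]]
    neighbor = [[[2.0, 1.0]]]
    flow = [[(1, 0), (-1, 0)]]
    print(aggregate(ref, [neighbor], [flow]))
```

In a detector, the fused map would stand in for the single-frame feature map passed on to the detection head. Neighbouring frames whose warped features disagree with the reference (for example because of occlusion or a poor flow estimate) receive lower weights and contribute less to the result.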