An Efficient Wide-Range Pseudo-3D Vehicle Detection Using A Single Camera

Zhupeng Ye, Yinqi Li, Zejian Yuan
DOI: https://doi.org/10.48550/arXiv.2309.08369
2023-09-15
Abstract: Wide-range and fine-grained vehicle detection plays a critical role in enabling active safety features in intelligent driving systems. However, existing vehicle detection methods based on rectangular bounding boxes (BBox) often struggle to perceive objects over a wide range, especially small objects at long distances. Moreover, the BBox representation cannot provide detailed geometric shape and pose information about vehicles. This paper proposes a novel wide-range Pseudo-3D Vehicle Detection method that uses images from a single camera and incorporates efficient learning methods. The model takes a spliced image as input, which is obtained by combining two sub-window images from a high-resolution image. This image format maximizes the utilization of limited image resolution to retain essential information about wide-range vehicle objects. To detect pseudo-3D objects, our model adopts specifically designed detection heads. These heads simultaneously output extended BBox and Side Projection Line (SPL) representations, which capture vehicle shapes and poses, enabling high-precision detection. To further enhance detection performance, a joint constraint loss combining both the object box and the SPL is designed for model training, improving the efficiency, stability, and prediction accuracy of the model. Experimental results on our self-built dataset demonstrate that our model achieves favorable performance in wide-range pseudo-3D vehicle detection across multiple evaluation metrics. Our demo video is available at https://www.youtube.com/watch?v=1gk1PmsQ5Q8.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses wide-range, fine-grained vehicle detection in intelligent driving systems. Existing vehicle detection methods based on rectangular bounding boxes (BBox) have difficulty perceiving small targets at long distances and cannot provide detailed geometric shape and pose information. The paper therefore proposes a new wide-range pseudo-3D vehicle detection method based on monocular camera images, aiming to improve detection precision for small, distant targets while maintaining high detection efficiency. Specifically, the paper proposes the following innovations:
1. **Double-Window Image (DW Image)**: Two sub-window images are extracted from a high-resolution image and spliced into a single input image, so that the limited input resolution is fully used to retain the key information of wide-range vehicle objects.
2. **Pseudo-3D Vehicle Representation (P3DVR)**: P3DVR consists of an extended bounding box (extended BBox) and a Side Projection Line (SPL), which together describe the position, size, appearance, and pose of the vehicle.
3. **Joint-Constrained Loss**: A joint constraint loss combining the extended BBox and the SPL provides overall supervision and enhances the consistency, stability, and accuracy of the model's predictions.
Together, these contributions address the shortcomings of existing methods in wide-range vehicle detection. Experimental results on the authors' self-built dataset show that the method performs well on multiple evaluation metrics.
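The double-window idea can be illustrated with a minimal sketch. The source states only that two sub-windows are taken from a high-resolution image and spliced into one input. Everything else here is an assumption made for illustration: the function names (`resize_nearest`, `make_dw_image`), the window coordinates, the vertical stacking order, and the use of nearest-neighbour resizing. The authors' actual layout and preprocessing may differ.

```python
import numpy as np


def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize via integer index sampling (no extra dependencies)."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]


def make_dw_image(frame, wide_box, far_box, win_h, win_w):
    """Splice two sub-windows of `frame` into one image (hypothetical layout).

    Boxes are (x0, y0, x1, y1) in pixel coordinates of the original frame.
    `wide_box` covers a large field of view and is downsampled;
    `far_box` covers a small distant region and can be kept near native resolution.
    """
    def crop(box):
        x0, y0, x1, y1 = box
        return frame[y0:y1, x0:x1]

    wide = resize_nearest(crop(wide_box), win_h, win_w)
    far = resize_nearest(crop(far_box), win_h, win_w)
    # Assumption: the two windows are stacked vertically, wide view on top.
    return np.concatenate([wide, far], axis=0)


# Example with a synthetic 1080p frame; all coordinates are illustrative.
frame = np.random.randint(0, 256, size=(1080, 1920, 3), dtype=np.uint8)
dw = make_dw_image(
    frame,
    wide_box=(0, 180, 1920, 900),    # 1920x720 region, downsampled 3x
    far_box=(640, 400, 1280, 640),   # 640x240 region, kept at native resolution
    win_h=240,
    win_w=640,
)
print(dw.shape)
```

In this sketch the distant region keeps its original pixels while the wide view is downsampled, which is one way to spend a fixed input resolution on both near and far vehicles. Detections made on the spliced image would still need to be mapped back to the original frame using each window's offset and scale.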