Precise Spatial Transformation Mechanism for Small-Size-Aware Roadside 3D Object Detection on Traffic Surveillance Cameras

Jianxiao Zhu, Xu Li, Qimin Xu, Xixiang Liu, Haitao Wang, Tao Jiang
DOI: https://doi.org/10.1109/jsen.2024.3462751
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: Autonomous vehicles are often hampered by blind spots in mixed traffic scenarios, where pedestrians and vehicles coexist and interact. Roadside three-dimensional (3D) object detection can effectively address this problem: it compensates for the limited field of view of the ego vehicle by perceiving the scene from a broad roadside perspective free of blind spots. To produce roadside 3D detection results, recent works transform features from image space to 3D or bird's-eye-view (BEV) space using either a prediction-based forward mechanism or a sampling-based backward mechanism. However, these methods mainly target general large objects, and vulnerable small-size pedestrians and cyclists have received less attention. In this paper, a sampling-refining mechanism is proposed to generate fine-grained 3D feature maps for small-size-aware roadside 3D object detection. Specifically, the spatial transformation is carried out in two stages: a sampling stage that aggregates image features and a refining stage that allocates the sampled features in space. In addition, a geometric-prompt-based height prediction module is proposed to estimate the height probabilities of each pixel and assist the spatial allocation. To further boost performance, a CNN-Transformer hybrid BEV backbone is designed to compress 3D features into BEV space and retrieve height information with a height-wise Transformer. Experiments on the typical roadside datasets DAIR-V2X-I and Rope3D demonstrate that the proposed method outperforms state-of-the-art algorithms across all difficulty levels for small-size objects, highlighting the effectiveness of our approach.
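To make the two-stage idea concrete, the following is a minimal NumPy sketch of a generic "sample, then reweight by height probability" transformation. It is not the authors' implementation: the function name `sample_and_refine`, the nearest-neighbour sampling, the uniform treatment of height bins, and all argument names are assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sample_and_refine(feat, height_logits, voxel_centers, P, height_edges):
    """Hypothetical sketch of a sampling stage followed by a refining stage.

    feat:          (C, H, W) image feature map
    height_logits: (K, H, W) per-pixel logits over K height bins (assumed input)
    voxel_centers: (N, 3) 3D points (x, y, z), with z taken as height
    P:             (3, 4) camera projection matrix
    height_edges:  (K + 1,) increasing edges of the height bins
    returns:       (N, C) voxel features
    """
    C, H, W = feat.shape
    K = height_logits.shape[0]
    probs = softmax(height_logits, axis=0)  # per-pixel height probabilities

    # Sampling stage: project each voxel center into the image and
    # pick the nearest pixel's feature (a backward, sampling-based lookup).
    homo = np.concatenate([voxel_centers, np.ones((len(voxel_centers), 1))], axis=1)
    uvw = homo @ P.T
    depth = uvw[:, 2]
    valid = depth > 1e-6                      # keep points in front of the camera
    safe_depth = np.where(valid, depth, 1.0)  # avoid division by zero
    u = np.round(uvw[:, 0] / safe_depth).astype(int)
    v = np.round(uvw[:, 1] / safe_depth).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u = np.clip(u, 0, W - 1)
    v = np.clip(v, 0, H - 1)
    sampled = feat[:, v, u].T                 # (N, C)

    # Refining stage: weight each sampled feature by the probability that
    # its pixel lies in the voxel's height bin, so features are allocated
    # mainly to voxels at a plausible height.
    k = np.searchsorted(height_edges, voxel_centers[:, 2], side="right") - 1
    k = np.clip(k, 0, K - 1)
    weight = probs[k, v, u] * valid           # zero out invalid projections
    return sampled * weight[:, None]
```

In this sketch, a voxel that projects outside the image or behind the camera receives an all-zero feature, and a pixel whose predicted height distribution is flat spreads its feature evenly over all voxels along its viewing ray.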