FARFusion V2: A Geometry-based Radar-Camera Fusion Method on the Ground for Roadside Far-Range 3D Object Detection

Yao Li, Jiajun Deng, Yuxuan Xiao, Yingjie Wang, Xiaomeng Chu, Jianmin Ji, Yanyong Zhang
DOI: https://doi.org/10.1145/3664647.3681128
2024-01-01
Abstract: Fusing data from millimeter-wave Radar sensors and high-definition cameras has emerged as a viable approach to precise 3D object detection for roadside traffic surveillance. For roadside perception systems, earlier studies have shown that fusion on the 2D image plane works better than fusion on the BEV plane (which is popular for on-car perception systems), especially when the perception range is large (e.g., >150 m). Image-plane fusion depends on two critical transformations: perspective projection from the Radar's BEV to the camera's 2D plane and, in the reverse direction, inverse perspective mapping (IPM). However, real-world issues such as uneven terrain and sensor movement degrade the precision of these transformations and reduce fusion effectiveness. To alleviate these issues, we propose FARFusion V2, a geometry-based Radar-camera fusion method on the ground. Specifically, we extend the ground-plane assumption in FARFusion [20] to support arbitrary ground shapes by formulating the ground height as an implicit representation based on geometric transformations. By incorporating this ground information, we enhance the Radar data with target height measurements. We can then project the enhanced Radar data onto the 2D plane to obtain more accurate depth information, which in turn assists the IPM process. We further introduce a real-time module that estimates the parameters of the view transformations in order to refine them. Moreover, since the two sensors are subject to different measurement noise, we introduce an uncertainty-based depth fusion strategy into the 2D fusion process to maximize the probability of obtaining the optimal depth value. Extensive experiments on our collected roadside OWL benchmark demonstrate the excellent localization capacity of FARFusion V2 in far-range scenarios. Our method achieves an average location accuracy of 0.771 m when the detection range is extended up to 500 m.
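The abstract does not spell out how the uncertainty-based depth fusion is computed. As a rough illustration of the general idea only, the sketch below fuses one Radar depth and one camera depth by inverse-variance weighting, which is the maximum-likelihood estimate if both measurements carry independent Gaussian noise. That noise model, the function name `fuse_depths`, and all numeric values are assumptions made for this example and are not taken from the paper.

```python
def fuse_depths(d_radar, var_radar, d_cam, var_cam):
    """Fuse two depth estimates (meters) by inverse-variance weighting.

    Assumption (not from the paper): each sensor's depth error is an
    independent zero-mean Gaussian with the given variance (m^2).
    Returns the fused depth and the variance of the fused estimate.
    """
    if var_radar <= 0 or var_cam <= 0:
        raise ValueError("variances must be positive")

    # A less noisy measurement receives a larger weight.
    w_radar = 1.0 / var_radar
    w_cam = 1.0 / var_cam

    fused = (w_radar * d_radar + w_cam * d_cam) / (w_radar + w_cam)
    fused_var = 1.0 / (w_radar + w_cam)
    return fused, fused_var


# Hypothetical far-range target: the Radar depth is assumed more precise
# than the depth recovered from the image.
depth, variance = fuse_depths(d_radar=200.0, var_radar=1.0,
                              d_cam=210.0, var_cam=4.0)
print(depth, variance)
```

With these made-up inputs the fused depth lies closer to the Radar reading, because the Radar measurement was assigned the smaller variance, and the fused variance is smaller than either input variance.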