Abstract: Far-range perception through roadside sensors is crucial to the effectiveness of intelligent transportation systems. Its main challenge is performing accurate object detection and tracking at long distances (e.g., > 150 m) at low cost. To cope with this challenge, deploying both millimeter-wave Radars and high-definition (HD) cameras and fusing their data for joint perception has become common practice. The key to this solution, however, is the precise association between the two types of data, which are captured from different perspectives and carry different degrees of measurement noise. Towards this goal, the first question is on which plane to conduct the association, i.e., the 2D image plane or the bird's-eye-view (BEV) plane. We argue that the former is more suitable because the location errors of the perspective projection points are smaller at far distances on the 2D plane, which leads to more accurate association. Thus, we first project the Radar-based target locations (on the BEV plane) onto the 2D plane and then associate them with the camera-based object locations, each modeled as a point on the object. Subsequently, we map the camera-based object locations to the BEV plane through inverse projection mapping (IPM), using the corresponding depth information from the Radar data. Finally, we employ a BEV tracking module to generate target trajectories for traffic monitoring. Since our approach involves transformation between the 2D plane and the BEV plane, we also devise an approach for refining the transformation parameters based on a depth scaling technique, which utilizes the above fusion process and requires no additional devices such as GPS. We have deployed an actual testbed on an urban expressway and conducted extensive experiments to evaluate the effectiveness of our system. The results show that our system can improve AP_BEV by 32% and reduce the location error by 0.56 m. Our system achieves an average location accuracy of 1.3 m when we extend the detection range up to 500 m. We thus believe that our proposed method offers a viable approach to efficient roadside far-range perception.
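The pipeline described in the abstract (project Radar targets from the BEV plane to the image, associate them with camera detections on the 2D plane, then lift the matched camera points back to the BEV plane using Radar depth) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the pinhole camera model with a flat ground plane, the numeric values of `FOCAL_PX`, `CX`, `CY`, `CAM_HEIGHT_M` and `PITCH_RAD`, the greedy nearest-neighbour matching with a pixel gate `gate_px`, and all function names.

```python
import math

# Hypothetical camera parameters (assumed, not from the paper): focal length
# in pixels, principal point, mounting height (m), downward pitch (rad).
FOCAL_PX, CX, CY = 4000.0, 960.0, 540.0
CAM_HEIGHT_M, PITCH_RAD = 8.0, math.radians(2.0)


def bev_to_image(x, y):
    """Project a ground point (x lateral, y forward, metres) to (u, v, depth)."""
    s, c = math.sin(PITCH_RAD), math.cos(PITCH_RAD)
    down = CAM_HEIGHT_M * c - y * s    # camera-frame "down" coordinate
    depth = y * c + CAM_HEIGHT_M * s   # camera-frame forward coordinate
    return CX + FOCAL_PX * x / depth, CY + FOCAL_PX * down / depth, depth


def image_to_bev(u, v, depth):
    """Lift a pixel back to BEV (x, y) given its depth along the optical axis."""
    s, c = math.sin(PITCH_RAD), math.cos(PITCH_RAD)
    right = (u - CX) / FOCAL_PX * depth
    down = (v - CY) / FOCAL_PX * depth
    return right, depth * c - down * s


def associate(radar_bev, camera_px, gate_px=30.0):
    """Greedy nearest-neighbour matching on the image plane (an assumed
    stand-in for the paper's association step).
    Returns {radar_index: camera_index}."""
    pairs = []
    for i, (x, y) in enumerate(radar_bev):
        u, v, _ = bev_to_image(x, y)
        for j, (cu, cv) in enumerate(camera_px):
            dist = math.hypot(u - cu, v - cv)
            if dist <= gate_px:        # ignore pairs outside the pixel gate
                pairs.append((dist, i, j))
    matches, used = {}, set()
    for _, i, j in sorted(pairs):      # closest pairs are committed first
        if i not in matches and j not in used:
            matches[i] = j
            used.add(j)
    return matches


def fuse(radar_bev, camera_px):
    """BEV positions of matched camera detections, using Radar-derived depth."""
    fused = []
    for i, j in associate(radar_bev, camera_px).items():
        depth = bev_to_image(*radar_bev[i])[2]
        fused.append(image_to_bev(*camera_px[j], depth))
    return fused


if __name__ == "__main__":
    radar = [(-1.8, 200.0), (1.8, 400.0)]    # made-up Radar targets (m)
    cams = [(978.0, 480.0), (924.0, 560.0)]  # made-up detection points (px)
    print(associate(radar, cams))
    print(fuse(radar, cams))
```

In this sketch the camera detection supplies the image position and the Radar target supplies the depth, which mirrors the division of labour the abstract describes. A real deployment would use calibrated parameters and could replace the greedy matcher with a globally optimal assignment.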

FARFusion V2: A Geometry-based Radar-Camera Fusion Method on the Ground for Roadside Far-Range 3D Object Detection

FARFusion: A Practical Roadside Radar-Camera Fusion System for Far-Range Perception

Fusing LiDAR and Radar with Pillars Attention for 3D Object Detection

Radar Voxel Fusion for 3D Object Detection

MVFusion: Multi-View 3D Object Detection with Semantic-aligned Radar and Camera Fusion

Radar-Camera Sensor Fusion for Joint Object Detection and Distance Estimation in Autonomous Vehicles

BEV-Radar: Bidirectional Radar-Camera Fusion for 3D Object Detection

Camera-Radar Fusion with Radar Channel Extension and Dual-CBAM-FPN for Object Detection

Radar and Camera Fusion for Multi-Task Sensing in Autonomous Driving

SparseFusion3D: Sparse Sensor Fusion for 3D object detection by Radar and Camera in Environmental Perception

Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving

HVDetFusion: A Simple and Robust Camera-Radar Fusion Framework

Bridging the View Disparity Between Radar and Camera Features for Multi-Modal Fusion 3D Object Detection

ROFusion: Efficient Object Detection using Hybrid Point-wise Radar-Optical Fusion

InterFusion: Interaction-based 4D Radar and LiDAR Fusion for 3D Object Detection

DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection

A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving

RSA-fusion: radar spatial attention fusion for object detection and classification

Radar-Lidar Fusion for Object Detection by Designing Effective Convolution Networks

CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection

RI-Fusion: 3D Object Detection Using Enhanced Point Features With Range-Image Fusion for Autonomous Driving