ROFusion: Efficient Object Detection using Hybrid Point-wise Radar-Optical Fusion

Liu Liu, Shuaifeng Zhi, Zhenhua Du, Li Liu, Xinyu Zhang, Kai Huo, Weidong Jiang
DOI: https://doi.org/10.48550/arXiv.2307.08233
2023-07-17
Abstract: Radars, due to their robustness to adverse weather conditions and ability to measure object motion, have served in autonomous driving and intelligent agents for years. However, Radar-based perception suffers from unintuitive sensing data that lack semantic and structural information about the scene. To tackle this problem, camera and Radar sensor fusion has been investigated as a trending strategy offering low cost, high reliability and strong maintainability. While most recent works explore how to exploit Radar point clouds and images, the rich contextual information within Radar observations is discarded. In this paper, we propose a hybrid point-wise Radar-Optical fusion approach for object detection in autonomous driving scenarios. The framework benefits from dense contextual information from both the range-Doppler spectrum and images, which are integrated to learn a multi-modal feature representation. Furthermore, we propose a novel local coordinate formulation, tackling the object detection task in an object-centric coordinate system. Extensive results show that, with the information gained from optical images, we achieve leading performance in object detection (97.69% recall) compared to the recent state-of-the-art method FFT-RadNet (82.86% recall). Ablation studies verify the key design choices and the practicability of our approach given imperfect machine-generated detections. The code will be available at https://github.com/LiuLiu-55/ROFusion.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses object detection in autonomous driving scenarios, and in particular how to fuse Radar and optical image data effectively to improve detection performance. Radar is robust under adverse weather conditions and can measure the range and velocity of objects, but its data lack semantic and structural information. Camera images are rich in semantic information, but their quality degrades significantly under adverse weather conditions. To combine the strengths of both sensors, the authors propose ROFusion, a hybrid point-wise Radar-Optical fusion framework. The method learns a multi-modal feature representation by combining features from the Radar range-Doppler spectrum with image features, and it introduces a local coordinate formulation that decomposes object detection into classification and regression sub-tasks in an object-centric coordinate system. Experimental results on the public RADIal dataset show that ROFusion clearly outperforms existing state-of-the-art methods, especially in difficult cases (e.g., scenes with heavy interference), where recall improves by 27.65% and the range error is reduced. The authors also conduct ablation studies to verify the contribution of the local coordinate formulation and the image fusion module to overall performance.
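The idea of solving detection in an object-centric coordinate system through per-point classification and regression can be illustrated with a small sketch. This is not the authors' implementation: the function names `to_local_targets` and `decode_center`, the (range, azimuth) point representation, the score threshold and the score-weighted averaging are all assumptions made for illustration only. Each Radar point regresses an offset to the centre of the object it belongs to and carries a classification score, and the object position is recovered by letting confident points vote.

```python
from typing import List, Optional, Tuple

# Hypothetical point representation: (range in metres, azimuth in degrees).
Point = Tuple[float, float]


def to_local_targets(points: List[Point], center: Point) -> List[Point]:
    """Regression targets in an object-centric frame (illustrative assumption).

    Each target is the offset from a point to the object centre.
    """
    return [(center[0] - r, center[1] - a) for r, a in points]


def decode_center(
    points: List[Point],
    offsets: List[Point],
    scores: List[float],
    threshold: float = 0.5,
) -> Optional[Point]:
    """Recover an object centre from per-point predictions.

    Points whose classification score is below `threshold` are ignored.
    The remaining points vote for the centre (point + predicted offset),
    and the votes are averaged with the scores as weights.
    """
    votes = [
        (r + dr, a + da, s)
        for (r, a), (dr, da), s in zip(points, offsets, scores)
        if s >= threshold
    ]
    if not votes:
        return None  # no confident point, so no detection
    total = sum(s for _, _, s in votes)
    return (
        sum(r * s for r, _, s in votes) / total,
        sum(a * s for _, a, s in votes) / total,
    )


if __name__ == "__main__":
    pts = [(20.0, -2.0), (21.0, 0.0), (22.0, 2.0)]
    targets = to_local_targets(pts, center=(21.0, 0.5))
    # With perfect offsets, every confident point votes for the same centre.
    print(decode_center(pts, targets, scores=[0.9, 0.8, 0.1]))
```

In this toy setup the low-score third point is treated as background and does not contribute, which mirrors the split into a classification sub-task (which points belong to an object) and a regression sub-task (where the object is relative to each point).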