RVNet: Deep Sensor Fusion of Monocular Camera and Radar for Image-Based Obstacle Detection in Challenging Environments

Vijay John, Seiichi Mita
DOI: https://doi.org/10.1007/978-3-030-34879-3_27
2019-01-01
Abstract: Camera-based and radar-based obstacle detection are important research topics in environment perception for autonomous driving. Camera-based obstacle detection reports state-of-the-art accuracy, but its performance degrades in challenging environments, where the camera features become noisy. In comparison, radar-based obstacle detection methods using the 77 GHz long-range radar are not affected by these environments. However, the radar features are sparse and do not delineate the obstacles. The camera and radar features are therefore complementary, and their fusion results in robust obstacle detection in varied environments. Once the sensors are calibrated, the radar features can be used to localize obstacles in the image, while the camera features can be used to delineate the localized obstacles. We propose a novel deep learning-based sensor fusion framework, termed "RVNet", for the effective fusion of the monocular camera and long-range radar for obstacle detection. RVNet is a single-shot object detection network with two input branches and two output branches. The input branches are separate, one for the monocular camera features and one for the radar features. The radar features are formulated using a novel feature descriptor, termed the "sparse radar image". The output branches are also separate, one for small obstacles and one for big obstacles. The proposed network is validated against state-of-the-art baseline algorithms on the nuScenes public dataset. Additionally, a detailed parameter analysis is performed with several variants of RVNet. The experimental results show that the proposed network outperforms the baseline algorithms in varying environmental conditions.
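The abstract does not define the "sparse radar image", so the following is only a minimal sketch of one plausible reading: calibrated radar returns are projected onto the image plane, and each hit pixel stores radar measurements while all other pixels stay empty. The function name `sparse_radar_image`, the pinhole camera model, the assumption that points are already expressed in the camera frame, and the choice of (depth, velocity) as channels are all illustrative assumptions, not the authors' formulation.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float]  # (depth in metres, radial velocity in m/s)


def sparse_radar_image(
    points: Sequence[Tuple[float, float, float, float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    width: int,
    height: int,
) -> List[List[Pixel]]:
    """Project radar points into a mostly empty image-sized grid.

    Hypothetical helper. Each point is (x, y, z, v) in the camera frame,
    with x to the right, y down, z forward, and v the radial velocity.
    Pixels with no radar return hold (0.0, 0.0).
    """
    image = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for x, y, z, v in points:
        if z <= 0.0:
            continue  # point is behind the camera, cannot be projected
        # Pinhole projection with assumed intrinsics fx, fy, cx, cy.
        col = int(round(fx * x / z + cx))
        row = int(round(fy * y / z + cy))
        if 0 <= col < width and 0 <= row < height:
            current_depth = image[row][col][0]
            # If two returns share a pixel, keep the nearer one.
            if current_depth == 0.0 or z < current_depth:
                image[row][col] = (z, v)
    return image
```

Under these assumptions, the resulting grid has the same spatial layout as the camera frame, which is what would allow a radar input branch to be fused with a camera input branch inside a single detection network.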