Mask-VRDet: A Robust Riverway Panoptic Perception Model Based on Dual Graph Fusion of Vision and 4D mmWave Radar

Runwei Guan, Shanliang Yao, Lulu Liu, Xiaohui Zhu, Ka Lok Man, Yong Yue, Jeremy Smith, Eng Gee, Yutao Yue
DOI: https://doi.org/10.1016/j.robot.2023.104572
IF: 3.7
2023-01-01
Robotics and Autonomous Systems
Abstract: With the development of Unmanned Surface Vehicles (USVs), perception of inland waterways has become important for autonomous navigation. RGB cameras capture images with rich semantic features, but they fail in adverse weather and at night. 4D millimeter-wave radar (4D mmWave radar), a perception sensor that has emerged in recent years, works in all weather and provides richer point-cloud features than ordinary radar, but it suffers severely from water-surface clutter. Furthermore, the shapes and outlines of the dense point clouds captured by 4D mmWave radar are irregular. CNN-based neural networks treat features as 2D rectangular grids, which excessively favors the image modality and is unfriendly to the radar modality. Therefore, we transform both image and radar features into non-Euclidean space as graph structures. In this paper, we focus on robust panoptic perception in inland waterways. Firstly, we propose the first Clutter-Point-Removal (CPR) algorithm for 4D mmWave radar, which removes water-surface clutter and improves the recall of radar targets. Secondly, we propose Mask-VRDet, a high-performance panoptic perception model based on graph neural networks that fuses vision and radar features to perform object detection and semantic segmentation simultaneously. To the best of our knowledge, Mask-VRDet is the first riverway panoptic perception model based on vision-radar graphical fusion. It outperforms other single-modal and fusion models and achieves state-of-the-art performance on our collected dataset. We release our code at https://github.com/GuanRunwei/Mask-VRDet-Official.
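The abstract does not say how image and radar features are turned into graphs. As a rough illustration only, the sketch below assumes k-nearest-neighbour connectivity in feature space: each feature vector (for example one per image patch or one per radar point) becomes a node, and edges link it to its closest neighbours. The function name `knn_graph`, the parameter `k`, and the toy data are hypothetical and are not taken from the paper or its released code.

```python
import math


def knn_graph(features, k):
    """Return directed edges (i, j) from each node to its k nearest neighbours.

    `features` is a list of equal-length float vectors, one per node.
    The kNN rule is an assumption made for illustration; it is not
    claimed to be the graph construction used by Mask-VRDet.
    """
    n = len(features)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of nodes")
    edges = []
    for i in range(n):
        # Euclidean distance from node i to every other node.
        dists = [(math.dist(features[i], features[j]), j)
                 for j in range(n) if j != i]
        # Sort by distance; ties are broken by node index.
        dists.sort()
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Hypothetical toy input: four nodes forming two well-separated pairs.
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
toy_edges = knn_graph(toy_features, k=1)
```

With `k=1`, each node in the toy input links only to the other member of its own pair, so irregularly shaped point sets are connected by proximity rather than by position on a rectangular grid.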