RaViDeep: Target Detection Based on Deep Fusion of Radar and Vision in Berthing Scenarios

Yuying Song, Jingxuan Wu, Wei Wu, Chunyi Song, Zhiwei Xu, Ming Zhang
DOI: https://doi.org/10.1109/tiv.2024.3432605
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Current radar-vision fusion techniques struggle to fully leverage the complementary data from sparse radar points and depth-deficient images, which limits their overall effectiveness. This paper proposes RaViDeep, a novel target detection method based on deep fusion of millimeter-wave radar and monocular images. First, a Semantic-based Point Cloud Registration (SPCR) module combines image semantics, radar hierarchical features, and Doppler data to enhance target spatial representation, yielding dense and stable semantic radar points. Next, a Radar-Guided Depth Estimation (RGDE) module with Gaussian enhancement uses accurate radar depth measurements to guide image depth estimation. This approach fosters a comprehensive scene understanding and effectively reduces measurement and calibration errors. Finally, a pseudo point cloud generated from the estimated image depth is integrated with the semantic radar points to facilitate target detection. A novel Non-Occupied Overlap (NOO) metric is also developed, tailored for autonomous berthing tasks in wharf scenarios. Experimental results demonstrate that RaViDeep surpasses state-of-the-art methods, achieving a 12.10% improvement in the NOO metric and a 13.90% improvement in recall. These results verify the superior performance and robustness of the method in practical wharf scenarios. The code and dataset are available at https://github.com/kagurua/RaViDeep.
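The final fusion step described in the abstract, turning estimated image depth into a pseudo point cloud and combining it with radar points, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard pinhole camera model, radar points already transformed into the camera frame, and a plain concatenation with a source flag. The function names `depth_to_pseudo_points` and `fuse_points`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the sample values are all hypothetical.

```python
import numpy as np

def depth_to_pseudo_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) camera-frame point cloud.

    Assumes a pinhole camera; pixels with non-positive depth are skipped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column and row grids
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def fuse_points(pseudo_points, radar_points):
    """Stack both point sets, appending a source flag (0 = image, 1 = radar).

    Assumption: radar_points are (M, 3) and already in the camera frame.
    """
    pseudo = np.hstack([pseudo_points, np.zeros((len(pseudo_points), 1))])
    radar = np.hstack([radar_points, np.ones((len(radar_points), 1))])
    return np.vstack([pseudo, radar])

# Toy 2x2 depth map with one invalid pixel, plus a single radar return.
depth = np.full((2, 2), 10.0)
depth[0, 0] = 0.0
points = depth_to_pseudo_points(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
radar = np.array([[1.0, 0.0, 12.0]])
fused = fuse_points(points, radar)
print(fused.shape)
```

The source flag is only one simple way to keep the two modalities distinguishable downstream; the paper's semantic radar points carry richer features than this sketch models.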