Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System

Daniel Dworak, Mateusz Komorkiewicz, Paweł Skruch, Jerzy Baranowski

2024-04-25

Abstract: In this paper, we propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems. Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance. Specifically, we extract 2D features from camera images using a state-of-the-art deep learning architecture and then apply a novel Cross-Domain Spatial Matching (CDSM) transformation method to convert these features into 3D space. We then fuse them with extracted radar data using a complementary fusion strategy to produce a final 3D object representation. To demonstrate the effectiveness of our approach, we evaluate it on the NuScenes dataset. We compare our approach to both single-sensor performance and current state-of-the-art fusion methods. Our results show that the proposed approach achieves superior performance over single-sensor solutions and could directly compete with other top-level fusion methods.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper asks **how to fuse camera and radar sensor data effectively in an autonomous vehicle perception system so that 3D object detection improves**. It proposes a new low-level fusion method that combines camera images with radar point clouds, drawing on the strengths of both sensors to make object detection more accurate and robust. The main contributions of the paper are:

1. **New low-level fusion method**: A projection-less method based on tensor-orientation matching, called **Cross-Domain Spatial Matching (CDSM)**, is proposed for fusing camera and radar data inside the neural network.
2. **Lightweight solution**: The method is competitive while remaining computationally efficient, reducing the consumption of computational resources without giving up detection performance.
3. **Multi-view processing architecture**: A single-stage network processes the camera images and the radar point cloud data separately, and the CDSM module aligns and fuses the resulting feature maps in 3D space.
4. **Experimental verification**: Experiments on the NuScenes dataset compare the method with single-sensor approaches and with current state-of-the-art fusion methods. According to the abstract, it outperforms the single-sensor solutions and can directly compete with the top fusion methods.

Through these contributions, the paper aims to provide a more efficient and reliable object detection method for autonomous vehicle perception systems, especially in complex and dynamic traffic environments.
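The summary above does not spell out how CDSM aligns the two feature maps, so the following is not the authors' method. It is only a minimal sketch of the general idea behind low-level fusion: once camera-derived and radar-derived features live on the same bird's-eye-view (BEV) grid, they can be combined cell by cell. The function names, the grid layout, the choice of radar channels (occupancy and mean radial velocity), and fusion by channel concatenation are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]  # one feature channel: rows x cols


def rasterize_radar(points: Sequence[Tuple[float, float, float]],
                    x_max: float, y_max: float, cell: float) -> List[Grid]:
    """Turn radar points (x forward, y left, radial velocity) into a BEV grid.

    The grid covers x in [0, x_max) and y in [-y_max, y_max), in metres.
    Channel 0 is occupancy (0 or 1); channel 1 is the mean velocity per cell.
    This layout is a hypothetical choice, not taken from the paper.
    """
    rows = int(round(x_max / cell))
    cols = int(round(2 * y_max / cell))
    count = [[0] * cols for _ in range(rows)]
    vel_sum = [[0.0] * cols for _ in range(rows)]
    for x, y, v in points:
        r = int(x // cell)
        c = int((y + y_max) // cell)
        if 0 <= r < rows and 0 <= c < cols:  # drop points outside the grid
            count[r][c] += 1
            vel_sum[r][c] += v
    occupancy = [[1.0 if n else 0.0 for n in row] for row in count]
    mean_vel = [[s / n if n else 0.0 for s, n in zip(s_row, n_row)]
                for s_row, n_row in zip(vel_sum, count)]
    return [occupancy, mean_vel]


def fuse_by_concatenation(camera_bev: List[Grid],
                          radar_bev: List[Grid]) -> List[Grid]:
    """Stack the channels of two feature maps defined on the same BEV grid."""
    cam_shape = (len(camera_bev[0]), len(camera_bev[0][0]))
    rad_shape = (len(radar_bev[0]), len(radar_bev[0][0]))
    if cam_shape != rad_shape:
        raise ValueError("feature maps must share the same BEV grid")
    return camera_bev + radar_bev


# Example: two radar returns fall in one cell, a third lies outside the grid.
radar_points = [(2.5, -1.0, 3.0), (2.7, -0.8, 5.0), (100.0, 0.0, 1.0)]
radar_bev = rasterize_radar(radar_points, x_max=4.0, y_max=2.0, cell=1.0)
# Stand-in for camera features already lifted to the same 4 x 4 grid.
camera_bev = [[[0.5] * 4 for _ in range(4)]]
fused = fuse_by_concatenation(camera_bev, radar_bev)
```

In a real network the camera channels would come from a learned backbone and the alignment step would be learned or geometric; here the camera map is a constant placeholder so the example stays self-contained.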
