CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation

Huawei Sun, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille
2024-08-30
Abstract: Depth estimation is critical in autonomous driving for interpreting 3D scenes accurately. Recently, radar-camera depth estimation has attracted considerable interest due to the robustness and low cost of radar. This paper therefore introduces a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) for dense depth estimation, combining RGB imagery with sparse and noisy radar point cloud data. The first stage addresses radar-specific challenges, such as ambiguous elevation and noisy measurements, by predicting a radar confidence map and a preliminary coarse depth map. A novel approach is presented for generating the ground truth for the confidence map, which involves associating each radar point with its corresponding object to identify potential projection surfaces. These maps, together with the initial radar input, are processed by a second encoder. For the final depth estimation, we introduce a confidence-aware gated fusion mechanism to integrate radar and image features effectively, thereby enhancing the reliability of the depth map by filtering out radar noise. Our methodology, evaluated on the nuScenes dataset, demonstrates superior performance, improving upon the current leading model by 3.2% in Mean Absolute Error (MAE) and 2.7% in Root Mean Square Error (RMSE). Code: https://github.com/harborsarah/CaFNet
Computer Vision and Pattern Recognition, Artificial Intelligence, Signal Processing
What problem does this paper attempt to address?
The problem this paper addresses is accurate depth estimation for autonomous driving. Specifically, the research combines RGB images with radar point cloud data to generate dense depth maps. In more detail:

1. **The importance of depth estimation in autonomous driving**:
   - Accurate 3D scene understanding is crucial for autonomous driving, and depth estimation is a key task in achieving it.
2. **Limitations of existing methods**:
   - Monocular RGB images provide rich visual information but lack explicit depth cues, which limits depth estimation performance.
   - LiDAR provides high-quality depth maps but is costly and sensitive to lighting and weather conditions.
   - Radar sensors are low-cost and adaptable, but their point clouds are sparse and noisy, which makes them difficult to use directly for depth estimation.
3. **The method proposed in this paper**:
   - The paper proposes a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) that combines RGB images with sparse and noisy radar point cloud data for dense depth estimation.
4. **Specific challenges and solutions**:
   - **Radar-specific challenges**: Radar point clouds are sparse and noisy, elevation is ambiguous, and multipath effects introduce a large number of false targets.
   - **Solutions**:
     - **First stage**: Address the radar-specific challenges by predicting a radar confidence map and a preliminary coarse depth map.
     - **Second stage**: Introduce the confidence-aware gated fusion mechanism (CaGF) to integrate radar and image features effectively and improve the reliability of the depth map.
5. **Innovations**:
   - A new method for generating the ground truth of the radar confidence map, which associates each radar point with its corresponding object to identify potential projection surfaces.
   - The confidence-aware gated fusion technique (CaGF), which limits the propagation of erroneous radar data and improves overall depth estimation performance.
6. **Experimental results**:
   - Evaluated on the nuScenes dataset, CaFNet improves on the current leading model by 3.2% in mean absolute error (MAE) and 2.7% in root mean square error (RMSE).

In summary, the paper tackles depth estimation for autonomous driving by fusing RGB images with radar point cloud data in the Confidence-aware Fusion Net (CaFNet), improving the accuracy and reliability of the estimated depth.
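
To make the idea of confidence-aware gated fusion more concrete, here is a minimal sketch in plain Python. It is not the paper's CaGF module: the function name `confidence_gated_fusion`, the scalar parameters `gate_weight` and `gate_bias`, and the fusion formula itself are assumptions chosen for illustration. The sketch only shows the general principle that a radar feature contributes to the fused result in proportion to its predicted confidence, so low-confidence (likely noisy) radar returns are suppressed.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def confidence_gated_fusion(image_feat, radar_feat, confidence,
                            gate_weight=1.0, gate_bias=0.0):
    """Fuse per-pixel image and radar features given as 2D lists of floats.

    Hypothetical formulation (an assumption, not taken from the paper):
        gate  = sigmoid(gate_weight * radar + gate_bias)
        fused = image + confidence * gate * radar
    In a real network the gate would be a learned convolution rather
    than a single scalar weight and bias.
    """
    fused = []
    for img_row, rad_row, conf_row in zip(image_feat, radar_feat, confidence):
        fused_row = []
        for img, rad, conf in zip(img_row, rad_row, conf_row):
            gate = sigmoid(gate_weight * rad + gate_bias)
            # A confidence of 0 removes the radar contribution entirely.
            fused_row.append(img + conf * gate * rad)
        fused.append(fused_row)
    return fused


# Toy 2x2 feature maps: the top-right pixel has a strong radar response
# but zero confidence, so only its image feature survives.
image_feat = [[0.5, 0.5], [0.5, 0.5]]
radar_feat = [[2.0, 2.0], [0.0, 2.0]]
confidence = [[1.0, 0.0], [1.0, 0.5]]
fused = confidence_gated_fusion(image_feat, radar_feat, confidence)
```

In this toy input, the pixel with zero confidence keeps its image feature unchanged, while the same radar response at full confidence is passed through the gate and added to the image feature. Plain lists are used only to keep the sketch dependency-free; an actual implementation would operate on multi-channel tensors.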