Abstract: 3D object detection is an essential task for computer vision applications in autonomous vehicles and robotics. However, models often struggle to quantify detection reliability, leading to poor performance on unfamiliar scenes. We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector. These uncertainty estimates require minimal computational overhead and are generalizable across different architectures. We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections; our framework consistently improves over baselines by 10-20% on average. Finally, we integrate this suite of tasks into a system where a 3D object detector auto-labels driving scenes and our uncertainty estimates verify label correctness before the labels are used to train a second model. Here, our uncertainty-driven verification results in a 1% improvement in mAP and a 1-2% improvement in NDS.

What problem does this paper attempt to address?

The main problem this paper addresses is that existing 3D object detection models have difficulty quantifying the reliability of their detections, and they perform especially poorly on unfamiliar scenes. To address this, the authors introduce an evidential-learning-based method that efficiently estimates uncertainty in 3D object detection.

### Background and Problem Description of the Paper

3D object detection is an important computer vision task, widely used in autonomous driving and robotics. However, current models struggle to quantify the reliability of their detections, and performance drops significantly on unseen scenes. The paper examines uncertainty through three tasks:

1. **Identifying out-of-distribution (OOD) scenes**: when the detector encounters scenes that differ from the training data distribution, how to flag them as abnormal.
2. **Localization quality assessment**: how to assess how accurately each predicted bounding box is localized.
3. **Missed detections**: how to find objects that the model fails to detect (the false-negative problem).

### Proposed Solution

To address these problems, the authors propose an uncertainty estimation framework based on evidential learning. It has three components:

- **Model architecture**: The standard heatmap head is replaced with an evidential deep learning (EDL) head, which predicts not only the probability of object existence but also the uncertainty parameters \(\alpha_i\) and \(\beta_i\) associated with each BEV (bird's-eye-view) cell.

  \[ P(y_j = 1|x):=\frac{\alpha_j}{\alpha_j+\beta_j}, \quad U(x):=\frac{1}{\alpha_j+\beta_j} \]

- **Loss function**: The loss combines Bayesian risk with Gaussian focal loss (GFL) to handle class imbalance and to focus training on hard-to-classify samples.

  \[ L_{EDL}^i:=\sum_{j = 1}^{C}\left[y_{ij}\left(\psi(\alpha_{ij}+\beta_{ij})-\psi(\alpha_{ij})\right)\cdot\left(1-\frac{\alpha_{ij}}{\alpha_{ij}+\beta_{ij}}\right)^{\gamma}+(1 - y_{ij})\left(\psi(\alpha_{ij}+\beta_{ij})-\psi(\beta_{ij})\right)\cdot\left(\frac{\alpha_{ij}}{\alpha_{ij}+\beta_{ij}}\right)^{\gamma}\cdot(1-\hat{y}_{ij})^{\eta}\right] \]

- **Regularization term**: A regularization term based on the Kullback-Leibler divergence prevents the model from making over-confident predictions.

  \[ L_{Reg}^i:=\sum_{j = 1}^{C}\left[(\tilde{\alpha}_{ij}-1)\left(\psi(\tilde{\alpha}_{ij})-\psi(\tilde{\alpha}_{ij}+\tilde{\beta}_{ij})\right)+(\tilde{\beta}_{ij}-1)\left(\psi(\tilde{\beta}_{ij})-\psi(\tilde{\alpha}_{ij}+\tilde{\beta}_{ij})\right)-\log B(\tilde{\alpha}_{ij},\tilde{\beta}_{ij})\right] \]

### Experimental Verification

The authors verified the framework through multiple experiments.
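As a rough illustration of the formulas in the Proposed Solution, the sketch below evaluates them for a single BEV cell and class using only the Python standard library. It is not the authors' implementation. The function names are hypothetical, `digamma` is a hand-rolled approximation of \(\psi\), and the defaults `gamma=2.0` and `eta=4.0` are assumed values. The summary does not define \(\hat{y}\) or the tilde parameters, so `y_hat` is assumed here to be a Gaussian-smoothed heatmap target, and `alpha_t` and `beta_t` are simply taken as inputs.

```python
import math


def digamma(x: float) -> float:
    """Approximate psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def beta_prob_and_uncertainty(alpha: float, beta: float):
    """Return (P(y = 1 | x), U(x)) as defined in the summary."""
    total = alpha + beta
    return alpha / total, 1.0 / total


def edl_focal_term(alpha, beta, y, y_hat, gamma=2.0, eta=4.0):
    """One (cell, class) summand of L_EDL.

    gamma, eta and the meaning of y_hat are assumptions, not taken from the paper.
    """
    p = alpha / (alpha + beta)
    psi_total = digamma(alpha + beta)
    positive = y * (psi_total - digamma(alpha)) * (1.0 - p) ** gamma
    negative = ((1.0 - y) * (psi_total - digamma(beta))
                * p ** gamma * (1.0 - y_hat) ** eta)
    return positive + negative


def kl_regularizer_term(alpha_t: float, beta_t: float) -> float:
    """One summand of L_Reg: KL divergence from Beta(alpha_t, beta_t) to Beta(1, 1)."""
    log_beta_fn = (math.lgamma(alpha_t) + math.lgamma(beta_t)
                   - math.lgamma(alpha_t + beta_t))
    psi_total = digamma(alpha_t + beta_t)
    return ((alpha_t - 1.0) * (digamma(alpha_t) - psi_total)
            + (beta_t - 1.0) * (digamma(beta_t) - psi_total)
            - log_beta_fn)


if __name__ == "__main__":
    print(beta_prob_and_uncertainty(3.0, 1.0))  # (0.75, 0.25)
    print(edl_focal_term(10.0, 1.0, y=1.0, y_hat=1.0))
    print(kl_regularizer_term(5.0, 2.0))
```

Under these assumptions, the regularizer is zero at \(\alpha = \beta = 1\) (no evidence) and grows as the Beta distribution becomes more concentrated, which matches its stated purpose of penalizing over-confident predictions.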

Uncertainty Estimation for 3D Object Detection via Evidential Learning

EvCenterNet: Uncertainty Estimation for Object Detection using Evidential Learning

Efficient Uncertainty Estimation for Monocular 3D Object Detection in Autonomous Driving

Leveraging Front and Side Cues for Occlusion Handling in Monocular 3D Object Detection

MEDL-U: Uncertainty-aware 3D Automatic Annotation based on Evidential Deep Learning

An Uncertainty Estimation Framework for Probabilistic Object Detection

Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection

Augmenting 3-D Object Detection Through Data Uncertainty-Driven Auxiliary Framework

Labels Are Not Perfect: Inferring Spatial Uncertainty in Object Detection

Uncertainty-Aware AB3DMOT by Variational 3D Object Detection

CertainNet: Sampling-free Uncertainty Estimation for Object Detection

Feature Decoupling and Uncertainty Estimation for 3D Object Detection

Uncertainty-Aware Self-Improving Framework for Depth Estimation

Uncertainty Calibration and its Application to Object Detection

Exploiting Label Uncertainty for Enhanced 3D Object Detection From Point Clouds

Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks

Estimating 3D Uncertainty Field: Quantifying Uncertainty for Neural Radiance Fields

Efficient Multi-task Uncertainties for Joint Semantic Segmentation and Monocular Depth Estimation

Uncertainty Estimation and Out-of-Distribution Detection for LiDAR Scene Semantic Segmentation

Calibrated Perception Uncertainty Across Objects and Regions in Bird's-Eye-View

MonoAux: Fully Exploiting Auxiliary Information and Uncertainty for Monocular 3D Object Detection