Abstract: LiDAR-based 3D object detection is of paramount importance for autonomous driving. Recent trends show a remarkable improvement for bird's-eye-view (BEV) based and point-based methods as they demonstrate superior performance compared to range-view counterparts. This paper presents an insight that leverages range-view representation to enhance 3D points for accurate 3D object detection. Specifically, we introduce a Redemption from Range-view Module (R2M), a plug-and-play approach for 3D surface texture enhancement from the 2D range view to the 3D point view. R2M comprises BasicBlock for 2D feature extraction, Hierarchical-dilated (HD) Meta Kernel for expanding the 3D receptive field, and Feature Points Redemption (FPR) for recovering 3D surface texture information. R2M can be seamlessly integrated into state-of-the-art LiDAR-based 3D object detectors as preprocessing and achieve appealing improvement, e.g., 1.39%, 1.67%, and 1.97% mAP improvement on the easy, moderate, and hard difficulty levels of the KITTI val set, respectively. Based on R2M, we further propose R2Detector (R2Det) with the Synchronous-Grid RoI Pooling for accurate box refinement. R2Det outperforms existing range-view-based methods by a significant margin on both the KITTI benchmark and the Waymo Open Dataset. Codes will be made publicly available.

What problem does this paper attempt to address?

The paper addresses a weakness of range-view-based 3D object detection: how to use the dense semantic information in 2D range images to enhance 3D point cloud data and thereby improve detection accuracy. It points out that current range-view-based methods process 2D range images efficiently but lose 3D surface texture information, which lowers detection accuracy. To address this, the paper proposes the Redemption from Range-view Module (R2M). R2M extracts features from the 2D range image and maps them back to the 3D point view, which restores the surface texture information of 3D objects and improves detection performance.

### Main Contributions

1. **An efficient plug-and-play R2M module**: The module strengthens 2D feature extraction by expanding the 3D receptive field, recovers the 3D surface texture information that range-view processing loses, and exploits point-level semantic information in range images.
2. **R2Det**: A new detector that overcomes the limitations of range-view representations and achieves more accurate 3D object detection.
3. **Extensive experiments on KITTI and the Waymo Open Dataset**: The results show that R2M improves the performance of existing 3D object detectors and that R2Det achieves state-of-the-art performance on both datasets.

### Method Overview

1. **R2M module**:
   - **Range-view feature extraction**: BasicBlock and the Hierarchical-dilated (HD) Meta Kernel extract features from the 2D range image and expand the 3D receptive field.
   - **Feature Points Redemption (FPR)**: 2D feature points are converted back to the 3D point view to restore 3D surface texture information.
2. **3D voxel feature extraction and proposal generation**:
   - A 3D voxel CNN divides the 3D feature points into voxels and generates 3D proposals.
3. **Candidate box refinement**:
   - **Global scene representation**: The point cloud is down-sampled with the Furthest Point Sampling algorithm to produce a representative point set.
   - **Local RoI feature extraction**: The S-Grid RoI Pooling module aggregates RoI features over different grid sizes and sampling radii.
   - **Detection head**: A Multi-Layer Perceptron (MLP) predicts the confidence score of each region proposal and regresses the box coordinates.

### Experimental Results

- **KITTI dataset**: R2Det achieves state-of-the-art performance on both the KITTI test set and the validation set, reaching a 3D AP of 77.84% at the hard difficulty level.
- **Waymo Open Dataset**: R2Det achieves the best mAP in the vehicle category at both difficulty levels, 78.4% and 70.2% respectively.

### Conclusion

With the R2M module and the R2Det framework, the paper addresses the loss of 3D surface texture information in range-view-based 3D object detection and improves detection performance. The experimental results on KITTI and the Waymo Open Dataset support the effectiveness of the method.
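
The range-view feature extraction and Feature Points Redemption steps in the method overview both depend on a correspondence between 3D points and range-image pixels. The sketch below illustrates only that correspondence and is not the authors' R2M implementation. It assumes a standard spherical projection; the function names `project_to_range_image` and `gather_point_features`, the tiny image size, and the vertical field of view are hypothetical choices made for illustration, and it uses only the Python standard library.

```python
import math


def project_to_range_image(points, height=4, width=8,
                           fov_up_deg=15.0, fov_down_deg=-15.0):
    """Spherically project (x, y, z) points onto a height x width range image.

    Returns (image, pixel_index). image[row][col] is the range of the closest
    point that landed in that pixel, or None if the pixel is empty.
    pixel_index[i] is the (row, col) of point i, or None for a point located
    exactly at the sensor origin. All sizes and angles are illustrative.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    image = [[None] * width for _ in range(height)]
    pixel_index = []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            pixel_index.append(None)  # direction is undefined at the origin
            continue
        azimuth = math.atan2(y, x)      # horizontal angle in [-pi, pi]
        inclination = math.asin(z / r)  # vertical angle
        col = int(0.5 * (1.0 - azimuth / math.pi) * width)
        row = int((fov_up - inclination) / (fov_up - fov_down) * height)
        # Clamp points that fall on or outside the image border.
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        # When several points share a pixel, keep the closest range.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
        pixel_index.append((row, col))
    return image, pixel_index


def gather_point_features(feature_map, pixel_index):
    """Look up, for every point, the 2D feature stored at its pixel."""
    return [None if rc is None else feature_map[rc[0]][rc[1]]
            for rc in pixel_index]


if __name__ == "__main__":
    pts = [(10.0, 1.0, 0.5), (20.0, 2.0, 1.0), (-5.0, 3.0, -0.5)]
    image, index = project_to_range_image(pts)
    # Stand-in for a 2D feature map: each pixel's "feature" is its own index.
    features = [[(r, c) for c in range(8)] for r in range(4)]
    print(index)
    print(gather_point_features(features, index))
```

Keeping the per-point pixel index is the relevant design choice here: once a 2D network has produced a feature map, every original 3D point can read back its own feature instead of only the one point per pixel that survives in the range image.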
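
The global scene representation step in the method overview down-samples the point cloud with Furthest Point Sampling. The following is a minimal pure-Python sketch of the generic algorithm, assuming squared Euclidean distance and a fixed starting index. The function name `furthest_point_sampling` and its parameters are illustrative and are not taken from the paper's code, which would normally use a GPU implementation.

```python
def furthest_point_sampling(points, num_samples, start_index=0):
    """Greedily pick indices of points that are mutually far apart.

    points: sequence of equal-length coordinate tuples.
    Returns a list of num_samples indices into points.
    """
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")
    if not 0 <= start_index < n:
        raise ValueError("start_index out of range")

    selected = [start_index]
    # Squared distance from each point to its nearest selected point so far.
    min_sq_dist = [float("inf")] * n
    for _ in range(num_samples - 1):
        last = points[selected[-1]]
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, last))
            if d < min_sq_dist[i]:
                min_sq_dist[i] = d
        # The next sample is the point furthest from the selected set.
        selected.append(max(range(n), key=min_sq_dist.__getitem__))
    return selected


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0),
             (10.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
    print(furthest_point_sampling(cloud, 3))
```

Each iteration costs time linear in the number of points, so selecting k samples from n points takes O(nk) work. The result spreads samples evenly over the scene rather than following point density, which is why the algorithm suits building a representative point set.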

R2Det: Redemption from Range-view for Accurate 3D Object Detection

Reinforcing LiDAR-Based 3D Object Detection with RGB and 3D Information

RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation

Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images

A Multi-view 3D Vehicle Detection Method Based On Novel 3D Proposal Generation Method

Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes

RIDE: Boosting 3D Object Detection for LiDAR Point Clouds via Rotation-Invariant Analysis

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

BADet: Boundary-Aware 3D Object Detection from Point Clouds

What Matters in Range View 3D Object Detection

BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

RI-Fusion: 3D Object Detection Using Enhanced Point Features With Range-Image Fusion for Autonomous Driving.

Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion

BEVHeight++: Toward Robust Visual Centric 3D Object Detection

Enhancing Grid-Based 3D Object Detection in Autonomous Driving with Improved Dimensionality Reduction.

An Empirical Analysis of Range for 3D Object Detection

From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection

3D Object Detector: A Multiscale Region Proposal Network Based on Autonomous Driving.

Stereo RGB and Deeper LIDAR-Based Network for 3D Object Detection in Autonomous Driving.

PV-RCNN++: Semantical Point-Voxel Feature Interaction for 3D Object Detection

Three-Attention Mechanisms for One-Stage 3-D Object Detection Based on LiDAR and Camera