Abstract: Accurately perceiving occluded objects is a challenging problem for autonomous vehicles. Human vision quickly locates important object regions in complex scenes while analysing other regions only roughly or ignoring them; this is known as the visual attention mechanism. The perception system of an autonomous vehicle, however, cannot know which part of the point cloud lies in the region of interest. It is therefore worthwhile to explore how the visual attention mechanism can be used in the perception system of autonomous driving. In this paper, we propose the spatial attention frustum model to address object occlusion in 3D object detection. The spatial attention frustum suppresses unimportant features and allocates limited neural computing resources to critical parts of the scene, thereby providing greater relevance and easier processing for higher-level perceptual reasoning tasks. To ensure that our method retains good reasoning ability on occluded objects for which only a partial structure is visible, we propose a local feature aggregation module that captures more complex local features of the point cloud. Finally, we discuss the projection constraint between the 3D bounding box and the 2D bounding box and propose a joint anchor box projection loss function, which helps improve the overall performance of our method. Results on the KITTI dataset show that the proposed method effectively improves the detection accuracy of occluded objects. Our method achieves 89.46%, 79.91%, and 75.53% detection accuracy at the easy, moderate, and hard difficulty levels of the car category, with a 6.97% performance improvement at the hard level, which has a high degree of occlusion. Our one-stage method does not rely on an additional refinement stage, yet its accuracy is comparable to that of two-stage methods.
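
The projection constraint between a 3D bounding box and a 2D bounding box can be illustrated with a small sketch. This is not the joint anchor box projection loss from the paper, whose exact form the abstract does not give. It is a generic consistency term under stated assumptions: a pinhole camera with a 3x3 intrinsic matrix `K`, a KITTI-like camera frame (x right, y down, z forward), a box parameterised by its geometric centre, a (length, height, width) size and a yaw about the y axis, and a loss of 1 minus the IoU between the projected 3D box and the detected 2D box. All function names are hypothetical.

```python
import numpy as np


def box3d_corners(center, size, yaw):
    """Return the 8 corners, shape (8, 3), of a 3D box in camera coordinates.

    Assumed convention: center is the geometric centre, size is
    (length, height, width), and yaw rotates the box about the y axis.
    """
    l, h, w = size
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2.0
    y = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * h / 2.0
    z = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (rot_y @ np.vstack([x, y, z])).T + np.asarray(center, dtype=float)


def project_to_2d_box(corners, K):
    """Project corners with intrinsics K; return the enclosing (x1, y1, x2, y2).

    Assumes every corner lies in front of the camera (z > 0).
    """
    uvw = (K @ corners.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])


def iou_2d(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def projection_consistency_loss(center, size, yaw, box2d, K):
    """Illustrative loss: 1 - IoU(projected 3D box, detected 2D box)."""
    projected = project_to_2d_box(box3d_corners(center, size, yaw), K)
    return 1.0 - iou_2d(projected, box2d)


if __name__ == "__main__":
    # Hypothetical intrinsics and box, chosen only for illustration.
    K = np.array([[700.0, 0.0, 600.0],
                  [0.0, 700.0, 180.0],
                  [0.0, 0.0, 1.0]])
    center, size, yaw = (0.0, 0.0, 20.0), (4.0, 1.5, 1.8), 0.0
    detected = project_to_2d_box(box3d_corners(center, size, yaw), K)
    print(projection_consistency_loss(center, size, yaw, detected, K))
```

The term is zero when the projected 3D box coincides with the 2D box and grows towards 1 as they diverge, which is the kind of coupling a joint 3D/2D loss can exploit.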

3D Bounding Box Estimation for Autonomous Vehicles by Cascaded Geometric Constraints and Depurated 2D Detections Using 3D Results

Leveraging Front and Side Cues for Occlusion Handling in Monocular 3D Object Detection

A Multi-view 3D Vehicle Detection Method Based On Novel 3D Proposal Generation Method

Monocular 3-D Vehicle Detection Using a Cascade Network for Autonomous Driving

Image Guidance Based 3D Vehicle Detection in Traffic Scene

6DoF-3D: Efficient and accurate 3D object detection using six degrees-of-freedom for autonomous driving

Monocular 3D object detection via estimation of paired keypoints for autonomous driving

3D Detection for Occluded Vehicles From Point Clouds

3D Vehicle Detection Using Cheap LiDAR and Camera Sensors

F-PVNet: Frustum-Level 3-D Object Detection on Point–Voxel Feature Representation for Autonomous Driving

Spatial Attention Frustum: A 3D Object Detection Method Focusing on Occluded Objects

Monocular 3D object detection using dual quadric for autonomous driving

Real-Time 3D Object Detection From Point Cloud Through Foreground Segmentation

An Efficient Wide-Range Pseudo-3D Vehicle Detection Using A Single Camera

DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

Stereo R-CNN based 3D Object Detection for Autonomous Driving

KPV3D: Enhancing Multiview 3-D Vehicle Detection With 2-D Keypoint Priors

Accelerating Point-Voxel Representation of 3-D Object Detection for Automatic Driving

Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

Multi-view 3D Object Detection Network for Autonomous Driving