Abstract: Object detection is the foundation of safe autonomous vehicle (AV) operation. LiDAR and cameras are both widely used for detection, yet each has its own strengths and weaknesses. In 3-D object recognition, LiDAR struggles with occluded obstacles and with long-range objects. Cameras, in turn, are strongly affected by lighting and weather conditions and cannot provide precise depth information. Multisensor fusion is therefore frequently used to improve both the accuracy and the robustness of object detection. End-to-end fusion suffers from feature misalignment and suboptimal training strategies, while sequential fusion architectures fail to fully exploit high-density images to enhance point cloud data, especially where information is sparse at extended ranges. To address these challenges, we present a dense sequential fusion (DSF) framework that fuses camera and LiDAR data to improve the accuracy and robustness of 3-D object detection, particularly for distant objects. First, we develop a model that augments foreground points, specifically targeting the sparse points associated with far-range objects. Second, a foreground point refinement technique filters the long-tail points generated from images; this improves the object’s distinctiveness, especially when edge points are abundant, and supplies high-resolution raw and pseudo foreground points. Finally, voxel-based LiDAR 3-D detection methods detect objects from the high-resolution raw and pseudo point clouds. Experiments on the KITTI dataset show that the proposed method improves 3-D mAP by 2.59% compared with PointPillars and average precision (AP) for hard-level car detection by 1.27% compared with SECOND. It also improves the bird’s eye view (BEV) AP for far-range car detection by more than 10%.
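
The foreground augmentation and refinement steps described above can be pictured with a minimal NumPy sketch. This is not the authors' implementation: it assumes a dense depth map and a binary foreground mask are already available for the image, lifts the masked pixels to 3-D with a pinhole camera model, and uses a simple median-depth threshold as a stand-in for the long-tail refinement. The function names, the intrinsics `fx`, `fy`, `cx`, `cy`, the threshold `max_dev`, and all numeric values are hypothetical.

```python
import numpy as np


def backproject_foreground(depth, mask, fx, fy, cx, cy):
    """Lift foreground pixels with valid depth into camera-frame 3-D pseudo points.

    Assumes a pinhole camera; depth is in metres, mask is a boolean image.
    """
    v, u = np.nonzero(mask.astype(bool) & (depth > 0))
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def filter_long_tail(points, max_dev=1.0):
    """Drop pseudo points whose depth lies far from the object's median depth.

    Assumption: a median-based threshold stands in for the paper's refinement step.
    """
    if len(points) == 0:
        return points
    z = points[:, 2]
    keep = np.abs(z - np.median(z)) <= max_dev
    return points[keep]


# Toy example: a 4x4 depth map of an object at 20 m, with one edge pixel
# whose depth bleeds into the background at 35 m (a "long-tail" point).
depth = np.full((4, 4), 20.0)
depth[1, 3] = 35.0

mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True   # object interior
mask[1, 3] = True       # edge pixel wrongly included in the mask

pseudo = backproject_foreground(depth, mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
refined = filter_long_tail(pseudo, max_dev=1.0)

print(pseudo.shape, refined.shape)
```

In a sequential fusion pipeline of this kind, the refined pseudo points would be transformed into the LiDAR frame and concatenated with the raw points before being passed to a voxel-based detector; that step is omitted here because it depends on calibration details the abstract does not give.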

Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences

LiDAR-Based 3D Temporal Object Detection via Motion-Aware LiDAR Feature Fusion

An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds

Anchor-Based Transformer for Temporal LiDAR 3D Object Detection

MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

Frame Fusion with Vehicle Motion Prediction for 3D Object Detection

Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation

LEF: Late-to-Early Temporal Fusion for LiDAR 3D Object Detection

Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection

3D Multi-object Detection and Tracking with Sparse Stationary LiDAR

Influence of Camera-LiDAR Configuration on 3D Object Detection for Autonomous Driving

Enhancing 3D object detection through multi-modal fusion for cooperative perception

3D Point Cloud Object Detection Algorithm Based on Temporal Information Fusion and Uncertainty Estimation

Dense Sequential Fusion: Point Cloud Enhancement Using Foreground Mask Guidance for Multimodal 3-D Object Detection

Query-based Temporal Fusion with Explicit Motion for 3D Object Detection

Spatial Information Enhancement with Multi-Scale Feature Aggregation for Long-Range Object and Small Reflective Area Object Detection from Point Cloud

Semantically Enhanced Multi-Object Detection and Tracking for Autonomous Vehicles

LiDAR-based 3D Video Object Detection with Foreground Context Modeling and Spatiotemporal Graph Reasoning

3-D Objects Detection and Tracking Using Solid-State LiDAR and RGB Camera

MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences

3D Dynamic Multi-target Detection Algorithm Based on Cross-view Feature Fusion