Abstract: Three-dimensional (3D) object detection is a pivotal research topic in computer vision that aims to identify and locate objects in three-dimensional space. It is widely applied in fields such as geoscience, autonomous driving, and drone navigation. The rapid development of deep learning has brought significant advances in 3D object detection. However, as applications grow more complex, 3D object detection faces challenges such as data imbalance and limited network effectiveness. Specifically, our experiments revealed a notable discrepancy in LiDAR reflection intensity within a point cloud scene: intensities are stronger for nearby points and weaker for distant ones. We also observed a substantial disparity between the numbers of foreground and background points. In 3D object detection, foreground points matter more than background points, yet all points are usually downsampled indiscriminately in subsequent processing. To tackle these challenges, we work from both the data and network perspectives, designing a feature alignment filtering algorithm and a two-stage 3D object detection network. First, to achieve feature alignment, we introduce a correction equation that decouples intensity from distance and eliminates the distance-induced attenuation of intensity. Then, we design a background point filtering algorithm that uses the aligned data to alleviate the data imbalance. Because the accuracy of semantic segmentation plays a crucial role in 3D object detection, we also propose a two-stage deep learning network that integrates spatial and spectral information, in which a feature fusion branch is designed and embedded in the semantic segmentation backbone.
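The abstract does not state the correction equation or the filtering rule, so the following is only a minimal sketch of the general idea. It assumes a simple inverse-square attenuation model (intensity proportional to 1/d²) and a fixed intensity threshold; the function names, the reference distance, and the threshold are hypothetical and are not taken from the paper.

```python
import math

def correct_intensity(points, ref_dist=10.0):
    """Rescale each point's intensity to a common reference distance.

    Assumption (not from the paper): intensity falls off as 1/d^2, so
    multiplying by (d / ref_dist)^2 removes the distance dependence.
    points: list of (x, y, z, intensity) tuples in the sensor frame.
    """
    corrected = []
    for x, y, z, intensity in points:
        d = math.sqrt(x * x + y * y + z * z)  # range from the sensor
        corrected.append((x, y, z, intensity * (d / ref_dist) ** 2))
    return corrected

def filter_background(points, min_intensity=0.1):
    """Keep points whose aligned intensity reaches a hypothetical threshold."""
    return [p for p in points if p[3] >= min_intensity]

# Two returns from an equally reflective surface at 5 m and 20 m,
# plus one weakly reflective return at the 10 m reference distance.
raw = [
    (5.0, 0.0, 0.0, 0.8),
    (20.0, 0.0, 0.0, 0.05),
    (10.0, 0.0, 0.0, 0.05),
]
aligned = correct_intensity(raw)
kept = filter_background(aligned)
```

Under this assumed model, the first two points end up with the same aligned intensity although their raw intensities differ by a factor of 16, so a single threshold treats near and far points consistently. A real pipeline would likely need a sensor-specific attenuation model and a learned or tuned filtering criterion.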
Experiments on the KITTI dataset show that the proposed method achieves the following average precision (AP_R40) values for the easy, moderate, and hard difficulty levels, respectively: car (IoU 0.7): 89.23%, 80.14%, and 77.89%; pedestrian (IoU 0.5): 52.32%, 45.47%, and 38.78%; and cyclist (IoU 0.5): 76.41%, 61.92%, and 56.39%. By emphasizing both data quality optimization and an efficient network architecture, the proposed method achieves performance comparable to that of other state-of-the-art methods.
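For readers unfamiliar with the metric, AP_R40 averages the interpolated precision over 40 equally spaced recall positions (1/40, 2/40, ..., 1). The sketch below illustrates only that averaging step on a toy precision-recall curve. The function name and the input format are illustrative assumptions, and the IoU-based matching and difficulty filtering done by the official KITTI evaluation are not shown.

```python
def ap_r40(recalls, precisions):
    """Average precision over 40 recall positions (1/40, 2/40, ..., 1).

    recalls, precisions: parallel lists of points on a precision-recall
    curve. At each sampled recall r, the interpolated precision is the
    maximum precision among points with recall >= r (0 if none reach r).
    """
    total = 0.0
    for i in range(1, 41):
        r = i / 40
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates, default=0.0)
    return total / 40

# Toy curve: precision 1.0 up to recall 0.5, then precision 0.5 up to recall 1.0.
toy_ap = ap_r40([0.5, 1.0], [1.0, 0.5])
```

On this toy curve, the first 20 recall positions contribute a precision of 1.0 and the remaining 20 contribute 0.5, giving an AP_R40 of 0.75.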

Equal Emphasis on Data and Network: A Two-Stage 3D Point Cloud Object Detection Algorithm with Feature Alignment