Deep Learning-Based Pedestrian Detection Using RGB Images and Sparse LiDAR Point Clouds

Haoran Xu, Shuo Huang, Yixin Yang, Xiaodao Chen, Shiyan Hu
DOI: https://doi.org/10.1109/tii.2024.3353845
IF: 12.3
2024-01-01
IEEE Transactions on Industrial Informatics
Abstract: Environment perception for pedestrian detection is one of the fundamental tasks in autonomous driving. Fusing camera and light detection and ranging (LiDAR) information for pedestrian detection is challenging because data from the different modalities must be aligned, compensated, and fused, and because acquiring data from two modalities simultaneously adds further difficulty. This work addresses these challenges in both hardware and software. First, a multimodal pedestrian data acquisition platform is designed and built from an RGB camera, a sparse LiDAR, and a data processing module; the design covers hardware connection and deployment, sensor distortion correction and joint calibration, and data acquisition synchronization. Pedestrian data from multiple scenes are then collected with this platform to build a dedicated multimodal pedestrian detection dataset. Further, a two-branch multimodal multilevel fusion pedestrian detection network (MM-Net) is proposed, which includes a two-branch feature extraction module and a feature-level data fusion module. Extensive experiments on the multimodal pedestrian detection dataset and the KITTI dataset compare MM-Net with existing models. The experimental results demonstrate the superior performance of MM-Net.