Moving Object Segmentation Network for Multi-View Fusion of Vehicle-Mounted LiDAR

Jianwang Gan, Guoying Zhang, Yijin Xiong, Yongqi Gan
DOI: https://doi.org/10.1109/jsen.2024.3466271
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: Accurate motion recognition of objects in the surrounding environment using LiDAR sensors is crucial for the safe operation of driverless vehicles. Recent studies have extensively utilized 3D data or 2D projections to characterize LiDAR point cloud sequences for efficient moving target recognition. However, these approaches still face challenges in accurately identifying slow-moving or small-sized moving objects. To address these limitations, we propose a moving target segmentation network. Our network comprises three key components: the point view branches that preserve complete information, the bird’s eye view (BEV) branches dedicated to motion feature construction, and the range view (RV) branches that compensate for semantic features. In the initial segment of the BEV branch, we introduce a differential attention-based dynamic information enhancement module (DIEM). DIEM effectively enhances the dynamic information of both slow and small moving targets by integrating dynamic features between the current frame and its neighboring frames. During the BEV and RV encoding stage, we employ residual blocks to capture semantic features effectively. To expedite global feature capture, we propose a U-shaped pyramid pooling module. In the decoder, we present a multi-scale dynamic information calibration module (MDICM) designed to calibrate and fuse features from different layers. Finally, the BEV and RV branch output features are back-projected into the 3D space and fused with the point features to achieve motion target segmentation. Our method is validated on the public datasets of SemanticKITTI and Apollo, achieving advanced accuracy while meeting real-time requirements.
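The final step described in the abstract, back-projecting BEV and RV features to the 3D points and fusing them with point features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the grid extent and cell size, the range-image resolution, the vertical field of view (values typical of a 64-beam sensor), and fusion by simple concatenation are all assumptions made for illustration.

```python
import numpy as np

def bev_indices(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Map each point to a (row, col) cell of a BEV grid (assumed extent and cell size)."""
    cols = np.floor((points[:, 0] - x_range[0]) / cell).astype(np.int64)
    rows = np.floor((points[:, 1] - y_range[0]) / cell).astype(np.int64)
    width = int(round((x_range[1] - x_range[0]) / cell))
    height = int(round((y_range[1] - y_range[0]) / cell))
    # Points outside the grid are flagged so they can be skipped later.
    valid = (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    return rows, cols, valid

def rv_indices(points, height=64, width=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map each point to a (row, col) pixel of a spherical range image (assumed FOV)."""
    fov_up, fov_down = np.deg2rad(fov_up_deg), np.deg2rad(fov_down_deg)
    depth = np.maximum(np.linalg.norm(points[:, :3], axis=1), 1e-6)
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / depth)
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * height
    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)
    return rows, cols

def fuse_views(points, point_feats, bev_map, rv_map):
    """Gather per-point features from (H, W, C) view maps and concatenate them."""
    b_rows, b_cols, valid = bev_indices(points)
    r_rows, r_cols = rv_indices(points)
    # Points that fall outside the BEV grid receive zero BEV features.
    bev_feats = np.zeros((points.shape[0], bev_map.shape[-1]), dtype=bev_map.dtype)
    bev_feats[valid] = bev_map[b_rows[valid], b_cols[valid]]
    rv_feats = rv_map[r_rows, r_cols]
    return np.concatenate([point_feats, bev_feats, rv_feats], axis=1)
```

In a learned network the concatenated per-point features would typically feed a small classification head that predicts moving versus static; the abstract does not specify how the fusion is performed, so concatenation here is only a placeholder.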