Abstract: Semantic segmentation of outdoor LiDAR point clouds is a major research area for large-scale driving scenarios. However, state-of-the-art methods still perform unsatisfactorily because of two intrinsic properties of outdoor point clouds: excessive sparsity and imbalanced point density, both of which make precise segmentation difficult. To address these problems and improve segmentation performance, we propose a new attention-based feature interaction module, the Voxel Slicing and Interaction based Attention Module (V-SIAM), which is integrated into segmentation networks. V-SIAM consists of a Voxel Slicing and Interaction Module (V-SIM) followed by a Voxel Attention Module (VAM). The V-SIM reduces the negative impact of imbalanced point density by enhancing the interaction among voxel features through a novel feature slicing scheme, which enriches the voxel feature details. The VAM reduces the negative effect of excessive sparsity by recalibrating the voxel features, which yields adaptive, self-enhanced voxel features. In addition to V-SIAM, we propose a Multi-scale Voxel Feature Extractor (MsVFE), used as the preprocessing module of the segmentation network, which further alleviates the influence of sparsity by fusing multi-scale voxel information of the sparse point clouds and thereby extracts more detailed point cloud features. Experimental results show that the proposed methods achieve 73.7% mIoU on the large-scale SemanticKITTI benchmark, outperforming the state-of-the-art PVKD and 2DPass by +1.3% and +0.8% mIoU, respectively. Moreover, the proposed MsVFE and V-SIAM achieve state-of-the-art performance on the Toronto3D and KITTI-360 datasets.
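
The idea of fusing voxel information at several scales, mentioned in the abstract, can be illustrated with a small sketch. This is not the authors' MsVFE, whose design the abstract does not specify. It is a hand-written, non-learned stand-in under stated assumptions: the voxel sizes (0.5, 1.0, 2.0), the choice of "offset from the voxel mean" as the per-scale feature, and all function names are hypothetical.

```python
from collections import defaultdict
from math import floor


def voxel_index(point, voxel_size):
    """Integer grid cell containing a 3D point for a given voxel edge length."""
    return tuple(floor(c / voxel_size) for c in point)


def voxel_means(points, voxel_size):
    """Map each occupied voxel to the mean coordinate of the points inside it."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])  # x, y, z sums and point count
    for p in points:
        acc = sums[voxel_index(p, voxel_size)]
        for axis in range(3):
            acc[axis] += p[axis]
        acc[3] += 1
    return {k: tuple(v[a] / v[3] for a in range(3)) for k, v in sums.items()}


def multi_scale_features(points, voxel_sizes=(0.5, 1.0, 2.0)):
    """For every point, concatenate its offset from the voxel mean at each scale.

    Assumption: simple concatenation stands in for a learned fusion step.
    """
    per_scale = [voxel_means(points, s) for s in voxel_sizes]
    features = []
    for p in points:
        feat = []
        for s, means in zip(voxel_sizes, per_scale):
            mean = means[voxel_index(p, s)]
            feat.extend(p[a] - mean[a] for a in range(3))
        features.append(feat)
    return features


# Two nearby points, one point about 1.5 m away, and one isolated point.
points = [(0.1, 0.1, 0.0), (0.3, 0.1, 0.0), (1.6, 0.2, 0.0), (7.9, 3.0, 0.5)]
features = multi_scale_features(points)
```

In this toy input the isolated point is alone in its voxel at every scale, so its feature vector is all zeros, while the first point only picks up context from the third point at the coarsest scale. That is the intuition behind combining scales for sparse data: coarse voxels supply neighbourhood context where fine voxels contain too few points.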

RPV-CASNet: range-point-voxel integration with channel self-attention network for lidar point cloud segmentation

RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation

Beyond single receptive field: A receptive field fusion-and-stratification network for airborne laser scanning point cloud classification

Multi-modal LiDAR Point Cloud Semantic Segmentation with Salience Refinement and Boundary Perception

Flexible asymmetric convolutional attention network for LiDAR semantic

DFAMNet: dual fusion attention multi-modal network for semantic segmentation on LiDAR point clouds

LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for Place Recognition

LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment

Multi-scale Network with Attentional Multi-resolution Fusion for Point Cloud Semantic Segmentation

MsVFE and V-SIAM: Attention-based multi-scale feature interaction and fusion for outdoor LiDAR semantic segmentation

PV-SSD: A Multi-Modal Point Cloud Feature Fusion Method for Projection Features and Variable Receptive Field Voxel Features

PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection

Semantic Segmentation of Urban Airborne LiDAR Point Clouds Based on Fusion Attention Mechanism and Multi-Scale Features

Multilevel intuitive attention neural network for airborne LiDAR point cloud semantic segmentation

MVG-Net: LiDAR Point Cloud Semantic Segmentation Network Integrating Multi-View Images

VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion

FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation

PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module

RGB and LiDAR Fusion-based 3D Semantic Segmentation for Autonomous Driving

RAAFNet: Reverse Attention Adaptive Fusion Network for Large-Scale Point Cloud Semantic Segmentation

Fast Context-Awareness Encoder for LiDAR Point Semantic Segmentation