Abstract: The semantic segmentation of point clouds is an important part of the environment perception for robots. However, it is difficult to directly adopt the traditional 3D convolution kernel to extract features from raw 3D point clouds because of the unstructured property of point clouds. In this paper, a spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator. This newly proposed feature extraction operator improves the accuracy of the network and reduces the parameters of the network. In addition, this paper analyzes the defect of point cloud interpolation methods based on the distance as the interpolation weight and proposes the self-learned distance-feature density by combining the distance and the feature correlation. The proposed method makes the feature extraction of spherical interpolated convolution network more rational and effective. The effectiveness of the proposed network is demonstrated on the 3D semantic segmentation task of point clouds. Experiments show that the proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.

What problem does this paper attempt to address?

This paper addresses several key problems in point cloud semantic segmentation:

1. **Unstructured nature of point clouds**: Traditional 3D convolution kernels are hard to apply directly to raw 3D point clouds because point cloud data is unstructured. The paper proposes a spherical interpolated convolution operator to replace the traditional grid-like 3D convolution operator and extract point cloud features more effectively.
2. **Reducing network parameters**: The new spherical interpolated convolution operator reduces the number of network parameters, which lowers memory usage while also improving network accuracy.
3. **Defects in point cloud interpolation methods**: Interpolation methods that use only distance as the weight have defects; for example, the interpolation weights are inconsistent across regions of different density. The paper proposes a self-learned distance-feature density, which combines distance and feature correlation to make feature extraction more reasonable and effective.
4. **Improving semantic segmentation performance**: The proposed method achieves good performance on 3D semantic segmentation. In particular, the experimental results on the ScanNet and Paris-Lille-3D datasets show that the method is competitive.

### Main contributions

1. **Spherical interpolated convolution operator**: A dense spherical interpolated convolution operator is proposed for unstructured point clouds in 3D space. Under the same network structure, this operator uses fewer learnable parameters than the grid-like 3D convolution operator, which reduces memory usage.
2. **Distance-feature density**: Based on the characteristics of spatial feature calculation, the concept of distance-feature density is proposed. Learning the distance-feature density and incorporating it into the spatial feature calculation makes that calculation more reasonable and effective.
3. **Semantic segmentation network design**: A semantic segmentation network is built on the proposed spherical interpolated convolution operator and distance-feature density. The experimental results show that this network performs well on 3D semantic segmentation, especially on the ScanNet and Paris-Lille-3D datasets.

### Method overview

1. **Spherical interpolated convolution operator**:
   - Use farthest point sampling (FPS) to sample output points from the input points.
   - The spherical unit corresponding to each convolution kernel center obtains its unit features by interpolating the features of the surrounding points.
   - Use 3D convolution to further process the interpolated features.
2. **Distance-feature density**:
   - Collect distance information and feature information within a small neighborhood of each point through a ball query.
   - Use 1×1 convolution and the ReLU activation function to discover the relationship between distance information and feature information.
   - Extract the aggregated density feature through max pooling.
   - Feed the density feature to a multi-layer perceptron (MLP) and obtain the distance-feature density through the Sigmoid activation function.
   - During interpolation, multiply the interpolation weight by the reciprocal of the distance-feature density to adjust how much each feature contributes.
3. **Network structure**:
   - The encoder contains 5 layers. Each layer contains two spherical interpolated convolution operators, one for down-sampling and one for feature extraction.
   - The decoder also contains 5 layers. Each layer uses a spherical interpolated convolution operator for feature extraction and passes features through skip connections.
   - Finally, the result is predicted through a fully connected layer and Dropout.

### Experimental results

The paper reports experiments on the ScanNet and Paris-Lille-3D datasets. The results show that the proposed spherical interpolated convolution network achieves good performance on 3D semantic segmentation and maintains high accuracy while reducing the number of network parameters.
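Two steps from the method overview above, farthest point sampling and density-adjusted interpolation, can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation. The function names `farthest_point_sampling` and `interpolate_feature` are hypothetical. The inverse-distance base weight, the normalization of weights to sum to 1, and the zero vector returned for an empty neighborhood are assumptions made for illustration. In the paper the density is learned by a small network ending in a Sigmoid; here it is simply passed in as a list of positive numbers.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    Returns the indices of the k sampled points, beginning with `start`.
    """
    chosen = [start]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def interpolate_feature(center, points, feats, density, radius, eps=1e-8):
    """Interpolate a feature vector at `center` from neighbours within `radius`.

    Assumed weighting: inverse distance multiplied by the reciprocal of each
    neighbour's density value, then normalised so the weights sum to 1.
    `density` stands in for the learned distance-feature density.
    """
    weights, neighbours = [], []
    for p, f, rho in zip(points, feats, density):
        d = math.dist(center, p)
        if d <= radius:
            weights.append(1.0 / ((d + eps) * rho))
            neighbours.append(f)
    channels = len(feats[0])
    if not neighbours:
        # Assumption: an empty neighbourhood yields a zero feature.
        return [0.0] * channels
    total = sum(weights)
    return [
        sum(w * f[c] for w, f in zip(weights, neighbours)) / total
        for c in range(channels)
    ]
```

With equal density values the result reduces to plain inverse-distance interpolation. Lowering a neighbor's density value raises its weight, which is the effect of multiplying by the reciprocal of the density described above.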

Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds

Spherical Interpolated Convolutional Network With Distance-Feature Density for 3-D Semantic Segmentation of Point Clouds

PASS3D: Precise and Accelerated Semantic Segmentation for 3D Point Cloud

Associate Semantic-Instance Segmentation of 3D Point Clouds Based on Local Feature Extraction

Superpoint-guided Semi-supervised Semantic Segmentation of 3D Point Clouds

3D Semantic Segmentation Using Deep Learning for Large-Scale Indoor Point Cloud

Semantic Segmentation of 3D Point Clouds Based on High Precision Range Search Network

PointMS: Semantic Segmentation for Point Cloud Based on Multi-scale Directional Convolution

3D Object Segmentation Using Cross-Window Point Transformer with Latent Semantic Boundary Guidance

Three-Dimensional Point Cloud Semantic Segmentation Network Based on Spatial Graph Convolution Network

Hierarchical Depthwise Graph Convolutional Neural Network for 3D Semantic Segmentation of Point Clouds

Interpolated Convolutional Networks for 3D Point Cloud Understanding

High-Performance Feature Extraction Network for Point Cloud Semantic Segmentation

Dilated Nearest-Neighbor Encoding for 3D Semantic Segmentation of Point Clouds

Semantic segmentation of large-scale point clouds based on dilated nearest neighbors graph

DPC-Net: Distributed Point Convolution Network for large-scale point clouds semantic segmentation

Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds

Semantic Labeling and Instance Segmentation of 3D Point Clouds Using Patch Context Analysis and Multiscale Processing

3D Graph Embedding Learning with a Structure-aware Loss Function for Point Cloud Semantic Instance Segmentation

Point Attention Network for Semantic Segmentation of 3D Point Clouds

SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation