MLFNet- Point Cloud Semantic Segmentation Convolution Network Based on Multi-Scale Feature Fusion

Jingfang Yang, Bochang Zou, Huadong Qiu, Zhi Li
DOI: https://doi.org/10.1109/access.2021.3057612
IF: 3.9
2021-01-01
IEEE Access
Abstract: In the semantic segmentation of a point cloud, segmentation errors can occur if the spatial structure correlation between the input features and the coordinates is not fully considered. We propose a spatial convolution method, MLFNet, that makes full use of multiscale spatial structure by combining local and global features. We also propose a multiscale feature framework. First, the point cloud is simplified by weighted farthest-point down-sampling, which combines farthest-point sampling with a weighted average. The neighborhood of each sampled point is then obtained by a KK octant search (an octant search optimized by k-nearest neighbors and a custom threshold), and feature information is extracted from it. This feature information is fed to the subsequent multilayer perceptron to fuse local context information. Finally, the fused features from multiple directions are max-pooled. The method was tested on self-made datasets and on standard benchmark datasets (ModelNet40, ShapeNet, and Stanford Large-Scale 3D Indoor Spaces (S3DIS)). Segmentation accuracy on our dataset was 0.937, two percentage points higher than the latest deep learning method. On ShapeNet, the method obtained a mean intersection over union (mIoU) of 0.867, 0.3 percentage points higher than the latest PointGrid. Accuracy on S3DIS was 0.8153, three percentage points higher than the latest spatial aggregation net. These semantic segmentation results support the superiority of the proposed method.
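The first two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `farthest_point_sampling` is the standard algorithm without the paper's weighted-average step (the abstract does not specify the weights), and `octant_knn` assumes the "custom threshold" is a simple distance cutoff, here the hypothetical parameter `radius`. All function and parameter names are illustrative.

```python
import numpy as np

def farthest_point_sampling(points, m, start=0):
    """Return indices of m points picked by standard farthest-point sampling."""
    chosen = np.empty(m, dtype=np.int64)
    chosen[0] = start
    # Distance from every point to its nearest already-chosen point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, m):
        chosen[i] = int(np.argmax(min_dist))
        d = np.linalg.norm(points - points[chosen[i]], axis=1)
        min_dist = np.minimum(min_dist, d)
    return chosen

def octant_knn(points, center, k, radius):
    """For each of the 8 octants around `center`, return the indices of up to
    k nearest points within `radius` (assumed form of the custom threshold)."""
    offsets = points - center
    dist = np.linalg.norm(offsets, axis=1)
    # Octant id in 0..7, built from the sign of each coordinate offset.
    octant = ((offsets[:, 0] >= 0).astype(int) * 4
              + (offsets[:, 1] >= 0).astype(int) * 2
              + (offsets[:, 2] >= 0).astype(int))
    neighbors = []
    for o in range(8):
        # Exclude the center itself (dist == 0) and points beyond the cutoff.
        idx = np.flatnonzero((octant == o) & (dist > 0) & (dist <= radius))
        neighbors.append(idx[np.argsort(dist[idx])][:k])
    return neighbors
```

Searching per octant rather than over a single ball returns neighbors from every direction around the sampled point, which fits the abstract's description of pooling fused features "in multiple directions".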
computer science, information systems, telecommunications, engineering, electrical & electronic
What problem does this paper attempt to address?
The paper introduces a new method for point cloud semantic segmentation called MLFNet, which stands for Multi-Scale Feature Fusion Network. The primary objective of the paper is to address issues in semantic segmentation of point clouds, particularly focusing on improving the accuracy and efficiency of segmentation.

### Problem Addressed

The authors aim to solve the following problems:

1. **Spatial Structure Correlation:** Existing methods do not fully utilize the spatial structure correlation between input features and coordinates, leading to segmentation errors.
2. **Feature Loss During Conversion:** Methods that convert point clouds into 2D images lose necessary features during the transformation process, resulting in overfitting and unclear features.
3. **Irregularity and Disorder:** Point clouds are inherently irregular and disordered, which poses challenges for segmentation.
4. **Loss of Information:** Existing methods suffer from information loss due to downsampling, normalization, and other preprocessing steps.
5. **Multiscale Feature Fusion:** While multiscale feature fusion has shown promise, there are still issues such as edge effects and inefficiency in processing large-scale datasets.

### Proposed Solution

To address these issues, the authors propose MLFNet, which combines local and global features using spatial convolution. Key components of their approach include:

1. **Farthest-point Weighted Mean Down-sampling (FWD):** Used to preserve the spatial struct