Abstract: Point cloud segmentation is essential for scene understanding and provides high-level information for applications such as autonomous driving, robotics, and virtual reality. To improve the accuracy and robustness of point cloud segmentation, many researchers have fused camera images to supply complementary color and texture information. The common fusion strategy combines convolutional operations with concatenation, element-wise addition, or element-wise multiplication. However, conventional convolutional operators confine the fusion of modal features to their receptive fields, so the fusion can be incomplete and limited. In addition, encoder-decoder segmentation networks cannot explicitly perceive segmentation boundary information, which results in semantic ambiguity and classification errors at object edges. These errors are further amplified in point cloud segmentation tasks and significantly reduce segmentation accuracy. To address these issues, we propose a novel self-attention multi-modal fusion network for point cloud semantic segmentation. First, to effectively fuse features from different modalities, we propose a Self-Cross Fusion Module (SCF), which models long-range modality dependencies and transfers complementary image information to the point cloud to fully leverage modality-specific advantages. Second, we design the Salience Refinement Module (SR), which calculates the importance of channels in the feature maps and global descriptors to enhance the representation capability of salient modal features. Finally, we propose the Local-aware Anisotropy Loss to measure the element-level importance in the data and explicitly provide boundary information to the model, which alleviates the inherent semantic ambiguity problem in segmentation networks. Extensive experiments on two benchmark datasets demonstrate that our proposed method surpasses current state-of-the-art methods.
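
The idea of transferring image information to the point cloud through attention, rather than through a convolution limited by its receptive field, can be illustrated with a generic cross-attention step. The sketch below is not the SCF module described in the abstract, whose details are not given here; it is a minimal NumPy example under stated assumptions. The function name `cross_attention_fuse`, the projection matrices `w_q`, `w_k`, `w_v`, the feature sizes, and the choice to concatenate the attended context onto the point features are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(point_feats, image_feats, w_q, w_k, w_v):
    """Generic cross-attention from points (queries) to image locations (keys/values).

    point_feats: (N, C) array, one row per LiDAR point.
    image_feats: (M, C) array, one row per image location.
    w_q, w_k, w_v: (C, D) projection matrices (hypothetical parameters).
    Returns the fused features of shape (N, C + D) and the (N, M) attention map.
    """
    q = point_feats @ w_q                      # (N, D) queries from the point cloud
    k = image_feats @ w_k                      # (M, D) keys from the image
    v = image_feats @ w_v                      # (M, D) values from the image
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M) scaled dot-product scores
    attn = softmax(scores, axis=-1)            # each point attends over all image locations
    context = attn @ v                         # (N, D) image context gathered per point
    # Assumption: fuse by concatenation; other fusion choices are possible.
    fused = np.concatenate([point_feats, context], axis=-1)
    return fused, attn

# Toy data with made-up sizes: 5 points, 7 image locations, 8 channels, 4 attention dims.
rng = np.random.default_rng(0)
N, M, C, D = 5, 7, 8, 4
point_feats = rng.normal(size=(N, C))
image_feats = rng.normal(size=(M, C))
w_q, w_k, w_v = (rng.normal(size=(C, D)) for _ in range(3))

fused, attn = cross_attention_fuse(point_feats, image_feats, w_q, w_k, w_v)
print(fused.shape)  # (5, 12)
```

Because every point attends over all image locations, the mixing is not restricted to a local neighborhood, which is the contrast with receptive-field-limited convolutional fusion that the abstract draws.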

Push-the-Boundary: Boundary-aware Feature Propagation for Semantic Segmentation of 3D Point Clouds

3D Object Segmentation Using Cross-Window Point Transformer with Latent Semantic Boundary Guidance

PointMS: Semantic Segmentation for Point Cloud Based on Multi-scale Directional Convolution

Multi-modal LiDAR Point Cloud Semantic Segmentation with Salience Refinement and Boundary Perception

Superpoint-guided Semi-supervised Semantic Segmentation of 3D Point Clouds

Associate Semantic-Instance Segmentation of 3D Point Clouds Based on Local Feature Extraction

Boundary–Inner Disentanglement Enhanced Learning for Point Cloud Semantic Segmentation

PointNest: Learning Deep Multiscale Nested Feature Propagation for Semantic Segmentation of 3-D Point Clouds

Boundary-Guided Lightweight Semantic Segmentation With Multi-Scale Semantic Context

Boundary-aware Graph Convolution for Semantic Segmentation

Weakly Supervised Semantic Segmentation of Point Cloud Scenes Using Boundary-based Feature Aggregation

Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds

Contrastive Boundary Learning for Point Cloud Segmentation

Visual Boundary-Guided Pseudo-Labeling for Weakly Supervised 3D Point Cloud Segmentation in Indoor Environments

Background-Aware 3D Point Cloud Segmentation with Dynamic Point Feature Aggregation

Multi-Scale Point-Wise Convolutional Neural Networks for 3D Object Segmentation From LiDAR Point Clouds in Large-Scale Environments

A Boundary Guided Cross Fusion Approach for Remote Sensing Image Segmentation

Semantic Segmentation of Point Cloud Scene via Multi-Scale Feature Aggregation and Adaptive Fusion

Semantic boundary enhancement and position attention network with long-range dependency for semantic segmentation

Weakly Supervised Point Cloud Segmentation Via Deep Morphological Semantic Information Embedding

GeoSegNet: Point Cloud Semantic Segmentation via Geometric Encoder-Decoder Modeling