Abstract: Large-scale point cloud semantic segmentation is a critical aspect of environmental information perception, with far-reaching applications in domains such as autonomous driving, remote sensing, and virtual reality systems. Contemporary methods for semantic segmentation of point clouds typically employ the K-Nearest Neighbors (KNN) algorithm to learn local features. However, this approach introduces local perception ambiguity, and effectively capturing global features dispersed across a large-scale scene remains a significant challenge. To address these limitations, we present LACV-Net, a neural architecture specifically tailored for semantic segmentation of large-scale point clouds. LACV-Net comprises three primary elements: (1) the Local Adaptive Feature Augmentation (LAFA) module, which adaptively learns the similarity weights among local neighbors, thereby enhancing local information and mitigating local perception ambiguity; (2) the Aggregation Loss Function, which uses similarity-weighted neighboring features (with weights derived from our LAFA module) as offsets, guiding convergence towards centroid features, thereby constraining the similarity weights and further mitigating local perception ambiguity; and (3) the Comprehensive Vector of Locally Aggregated Descriptors (C-VLAD) module, which seamlessly fuses local features across multiple resolution representations to generate a comprehensive global description vector, thereby capturing global context more efficiently. We have evaluated LACV-Net against state-of-the-art networks on several benchmarks, namely S3DIS, Toronto3D, and SensatUrban. The results demonstrate the superior efficacy of LACV-Net in accurately segmenting and classifying large-scale 3D point cloud scenes, highlighting its potential to advance environmental information perception.
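
To make the interplay between KNN neighborhoods, similarity weights, and the aggregation loss more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`knn_indices`, `similarity_weights`, `aggregation_loss`) are hypothetical, the learned LAFA weights are replaced by a fixed softmax over negative squared feature distances, and the loss is assumed to be the mean squared distance between each centroid feature and the similarity-weighted sum of its neighbors' features. The abstract does not specify any of these details.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbors (self excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append([j for _, j in dists[:k]])
    return result


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_weights(center_feat, neighbor_feats):
    """ASSUMPTION: stand-in for the learned LAFA weights.

    Neighbors whose features are closer to the centroid feature get larger weights.
    """
    scores = [
        -sum((a - b) ** 2 for a, b in zip(center_feat, f)) for f in neighbor_feats
    ]
    return softmax(scores)


def aggregation_loss(points, feats, k):
    """ASSUMPTION: mean squared distance between each centroid feature and the
    similarity-weighted sum of its neighbors' features."""
    neighbors = knn_indices(points, k)
    total = 0.0
    for i, idx in enumerate(neighbors):
        nbr = [feats[j] for j in idx]
        w = similarity_weights(feats[i], nbr)
        dim = len(feats[i])
        agg = [sum(wj * f[d] for wj, f in zip(w, nbr)) for d in range(dim)]
        total += sum((a - c) ** 2 for a, c in zip(agg, feats[i]))
    return total / len(points)


if __name__ == "__main__":
    # Two spatial clusters with distinct features, plus one point between them.
    points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.55, 0.0, 0.0)]
    feats = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
    print(knn_indices(points, 2))
    print(aggregation_loss(points, feats, 2))
```

Under these assumptions, the loss is zero when every neighborhood shares the centroid's feature and grows when spatially close neighbors carry dissimilar features, which is the situation the abstract describes as local perception ambiguity.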

Fusion of images and point clouds for the semantic segmentation of large-scale 3D scenes based on deep learning

3D Semantic Segmentation Using Deep Learning for Large-Scale Indoor Point Cloud

A Prior Level Fusion Approach for the Semantic Segmentation of 3D Point Clouds Using Deep Learning

An RGB-D Fusion Based Semantic Segmentation Algorithm Based on Neighborhood Metric Relations

BEMF-Net: Semantic Segmentation of Large-Scale Point Clouds via Bilateral Neighbor Enhancement and Multi-Scale Fusion

Enhanced Multi-Scale Feature Adaptive Fusion Sparse Convolutional Network for Large-Scale Scenes Semantic Segmentation

PointMS: Semantic Segmentation for Point Cloud Based on Multi-scale Directional Convolution

Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation

3D Semantic Segmentation of Large-Scale Point-Clouds in Urban Areas Using Deep Learning

Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation

Large-scale point cloud semantic segmentation via local perception and global descriptor vector

Semantic Segmentation of Point Cloud Scene via Multi-Scale Feature Aggregation and Adaptive Fusion

A deep learning-based global and segmentation-based semantic feature fusion approach for indoor scene classification

LLGF-Net: Learning Local and Global Feature Fusion for 3D Point Cloud Semantic Segmentation

Semantic 3D Reconstruction with Learning MVS and 2D Segmentation of Aerial Images

Deep Projective 3D Semantic Segmentation

Multispectral LiDAR Point Cloud Segmentation for Land Cover Leveraging Semantic Fusion in Deep Learning Network

Robust 3D Semantic Segmentation Method Based on Multi-Modal Collaborative Learning

Multi-Scale Point-Wise Convolutional Neural Networks for 3D Object Segmentation From LiDAR Point Clouds in Large-Scale Environments

Joint Semantic Segmentation using representations of LiDAR point clouds and camera images

Semantic segmentation of large-scale point clouds based on dilated nearest neighbors graph