Abstract: Given the prominence of 3-D sensors in recent years, 3-D point clouds warrant further investigation for environment perception and scene understanding. Learning accurate local and global contexts in point clouds is pivotal for semantic segmentation. Neighbor aggregation (NA) and transformers have achieved notable success in local and global perception, respectively, in point cloud analysis. Nevertheless, studying each in isolation falls short of comprehensive feature learning. To address this, we take a novel step toward investigating and integrating the structures of NA and transformers. In this article, we introduce Point Neighbor Aggregation with Transformer (PointNAT), a conceptually straightforward and effective approach that aims to improve 3-D point cloud semantic segmentation. PointNAT consists of an NA block (NAB) for local perception, a point transformer block (PTB) for global modeling, and a hybrid block that connects NABs and PTBs. NABs learn complex local features at varying scales through an improved NA operation and a multihead mechanism. PTBs perform global attention efficiently using a small set of learnable key points. Hybrid blocks serve as high- and low-frequency signal hybridizers, merging the strengths of the two blocks by adaptively assigning hybrid weights to local and global contexts. We evaluated PointNAT against state-of-the-art networks on several benchmarks, including Stanford Large-Scale 3-D Indoor Spaces (S3DIS), Toronto3D, and SensatUrban. PointNAT achieves mean intersection over union (mIoU) scores of 77.8%, 84.7%, and 65.2% on these three datasets, respectively. Furthermore, it outperforms the baseline approach PointNeXt by 3.0%, 1.3%, and 4.2%, respectively, while using only 59.9% of the parameters and 15.2% of the floating-point operations (FLOPs). The results demonstrate PointNAT's superior ability to accurately segment large-scale 3-D point cloud scenes, underscoring its potential to advance environment perception and scene understanding. Our code is available at https://github.com/zeng-ziyin/PointNAT.
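
The mIoU metric reported above can be illustrated with a minimal sketch. This is not the evaluation code from the PointNAT repository; the function name `mean_iou`, the list-based label format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration only, and individual benchmarks may handle absent classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes (illustrative sketch).

    pred, target: equal-length sequences of integer class labels, one per point.
    Assumption: classes that appear in neither pred nor target are skipped.
    """
    ious = []
    for c in range(num_classes):
        # Points labeled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points labeled c in the prediction or the ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six points, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.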

PointNAC: Copula-Based Point Cloud Semantic Segmentation Network

Associate Semantic-Instance Segmentation of 3D Point Clouds Based on Local Feature Extraction

3D Semantic Segmentation Using Deep Learning for Large-Scale Indoor Point Cloud

PointMS: Semantic Segmentation for Point Cloud Based on Multi-scale Directional Convolution

Point Attention Network for Point Cloud Semantic Segmentation

PointNest: Learning Deep Multiscale Nested Feature Propagation for Semantic Segmentation of 3-D Point Clouds

PointMM: Point Cloud Semantic Segmentation CNN under Multi-Spatial Feature Encoding and Multi-Head Attention Pooling

Superpoint-guided Semi-supervised Semantic Segmentation of 3D Point Clouds

PointNAT: Large-Scale Point Cloud Semantic Segmentation via Neighbor Aggregation With Transformer

3D Reconstruction and Semantic Segmentation Method Combining PointNet and 3D-Lmnet from Single Image

DenseKPNET: Dense Kernel Point Convolutional Neural Networks for Point Cloud Semantic Segmentation

Large-scale point cloud semantic segmentation via local perception and global descriptor vector

Point Attention Network for Semantic Segmentation of 3D Point Clouds

RAAFNet: Reverse Attention Adaptive Fusion Network for Large-Scale Point Cloud Semantic Segmentation

Semantic Segmentation of Aerial Laser Point Clouds Based on Deep-Residual Enhanced Coding of Multi-Feature Information

Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation

High-Performance Feature Extraction Network for Point Cloud Semantic Segmentation

SPDC: A Super-Point and Point Combining Based Dual-Scale Contrastive Learning Network for Point Cloud Semantic Segmentation

MSIDA-Net: Point Cloud Semantic Segmentation via Multi-Spatial Information and Dual Adaptive Blocks

Semantic Segmentation of 3D Point Clouds Based on High Precision Range Search Network

DCNet: Large-Scale Point Cloud Semantic Segmentation with Discriminative and Efficient Feature Aggregation