Learning Local Contextual Features for 3D Point Clouds Semantic Segmentation by Attentive Kernel Convolution

Guofeng Tong, Yuyuan Shao, Hao Peng
DOI: https://doi.org/10.1007/s00371-023-02819-9
IF: 2.835
2023-01-01
The Visual Computer
Abstract: Unlike 2D images, which are represented on regular grids, 3D point clouds are irregular and unordered, so directly applying convolutional neural networks (CNNs) to them is challenging. In this paper, we propose a novel deep neural network named AKNet for point cloud semantic segmentation. The key component of AKNet is the attentive kernel convolution (AKConv), a deformed convolution operation for perceiving sufficient local context in 3D scenes. AKConv first constructs Basic Weight Units that are robust to point ordering. Then, to capture more distinctive local features, the convolution kernels of AKConv are associated with Attentive Weight Units through a self-attentive function acting on the Basic Weight Units. Furthermore, 3D point clouds provide richer geometric shape information, which helps in recognizing objects. However, feeding only raw point features to the convolution function could cause geometric information loss, so we use augmented features as the input of AKConv. In addition, to preserve semantic information from the encoding to the decoding layers, we introduce a backward encoding (BE) mechanism that utilizes higher-layer semantic features. We conduct experiments on three large-scale point cloud datasets. The experimental results demonstrate that AKNet outperforms state-of-the-art (SOTA) networks.
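The abstract gives no formulas, so the following is only a rough sketch of the general idea it describes: per-point weights computed independently of point order, a softmax-style attention over those weights, and aggregation of geometry-augmented features. It is not the AKConv formulation from the paper. The function names, the choice of augmentation (relative position plus distance plus raw feature), and the linear scoring vector `score_w` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def augment(center, neighbor, feat):
    # Hypothetical augmentation: relative position, Euclidean distance,
    # then the raw point feature. The paper's actual choice may differ.
    rel = [n - c for n, c in zip(neighbor, center)]
    dist = math.sqrt(sum(r * r for r in rel))
    return rel + [dist] + list(feat)


def attentive_aggregate(center, neighbors, feats, score_w):
    # Each neighbor is scored by the same function, so the set of scores
    # does not depend on the order in which neighbors are listed.
    augmented = [augment(center, p, f) for p, f in zip(neighbors, feats)]
    basic = [sum(w * a for w, a in zip(score_w, aug)) for aug in augmented]
    # Attention weights: softmax over the per-neighbor scores (an assumption).
    attn = softmax(basic)
    dim = len(augmented[0])
    # Weighted sum over neighbors, which is a symmetric (order-free) reduction.
    out = [sum(a * aug[d] for a, aug in zip(attn, augmented)) for d in range(dim)]
    return out, attn


# Toy example: one center point with three neighbors and 1-D raw features.
center = [0.0, 0.0, 0.0]
neighbors = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
feats = [[0.5], [1.0], [1.5]]
score_w = [0.1, 0.2, 0.3, 0.4, 0.5]  # illustrative values, not learned
out, attn = attentive_aggregate(center, neighbors, feats, score_w)
```

Because both the scoring and the weighted sum treat neighbors symmetrically, reordering `neighbors` and `feats` together leaves `out` unchanged up to floating-point rounding, which is the ordering robustness the abstract refers to.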