IEFM and IDS: Enhancing 3D Environment Perception Via Information Encoding in Indoor Point Cloud Semantic Segmentation
Kaixiang Huang, Jin Wang, Jingru Yang, Ying Yang, Guodong Lu, Yuzhen Chen, Huan Yu, Qifeng Zhang
DOI: https://doi.org/10.1016/j.neucom.2023.126944
IF: 6
2024-01-01
Neurocomputing
Abstract: Point cloud semantic segmentation plays a significant role in understanding 3D environments. However, current 3D point cloud segmentation methods pay little attention to the inevitable blurring and loss of point information within deep networks, which considerably impairs segmentation performance, particularly in complex and colorful indoor scenes. To overcome this loss of point information (position, color, and normal vector), we propose a new Information Encoding and Fusion Module (IEFM), comprising a point Information Encoding Method (IEM) and an optimized Multi-Encoding Fusion Method (EFM). By constructing a bias standardization of point information and effectively merging the other information encodings into the position encoding, IEFM adaptively compensates for the loss of point information and thereby achieves enhanced environment perception. Although IEFM can handle multiple classes of point information, encoding interference caused by the coexistence of multiple kinds of information is unavoidable and degrades overall semantic segmentation performance. To reduce this interference, we further propose the Information Distribution Strategy (IDS), which hierarchically distributes the different kinds of point information so that interference among multiple information encodings is greatly mitigated, yielding more accurate indoor point cloud semantic segmentation. Owing to their modular design, IEFM and IDS can be easily inserted into existing point-based point cloud segmentation models. Experimental results demonstrate the effectiveness of the proposed methods across multiple state-of-the-art models and benchmarks (S3DIS and ScanNet), achieving a competitive 77.4% mIoU on the large-scale S3DIS benchmark.
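The mIoU figure reported in the abstract is a mean of per-class intersection-over-union scores. The following is a minimal sketch of how such a score is commonly computed from per-point class labels. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper, and the abstract does not state the exact evaluation protocol used on S3DIS.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred, target: equal-length sequences of integer class labels, one per point.
    Classes that appear in neither pred nor target are skipped (an assumption;
    evaluation scripts differ on how they treat absent classes).
    """
    ious = []
    for c in range(num_classes):
        # Points labeled c by both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points labeled c by either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six points (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)  # per-class IoU: 1/3, 2/3, 1/2
```

In this toy case the per-class scores average to roughly 0.5. A real benchmark evaluation would accumulate intersections and unions over all points of the test split before dividing.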