TSPconv-Net: Transformer and Sparse Convolution for 3D Instance Segmentation in Point Clouds

Xiaojuan Ning, Yule Liu, Yishu Ma, Zhiwei Lu, Haiyan Jin, Zhenghao Shi, Yinghui Wang
DOI: https://doi.org/10.3390/math12182926
IF: 2.4
2024-09-21
Mathematics
Abstract: Current deep learning approaches for indoor 3D instance segmentation often rely on multilayer perceptrons (MLPs) for feature extraction. However, MLPs struggle to effectively capture the complex spatial relationships inherent in 3D scene data. To address this issue, we propose a novel and efficient framework for 3D instance segmentation called TSPconv-Net. In contrast to existing methods that primarily depend on MLPs for feature extraction, our framework integrates a more robust feature extraction model comprising the offset-attention (OA) mechanism and submanifold sparse convolution (SSC). The proposed framework is an end-to-end network architecture. TSPconv-Net consists of a backbone network followed by a bounding box module. Specifically, the backbone network utilizes the OA mechanism to extract global features and employs SSC for local feature extraction. The bounding box module then conducts instance segmentation based on the extracted features. Experimental results demonstrate that our approach outperforms existing work on the S3DIS dataset while maintaining computational efficiency. TSPconv-Net achieves 68.6% mPrec, 52.5% mRec, and 60.1% mAP on the test set, surpassing 3D-BoNet by 3.0% mPrec, 5.4% mRec, and 2.6% mAP. Furthermore, it demonstrates high efficiency, completing computations in just 326 s.
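The abstract names offset-attention (OA) as the global feature extractor but does not spell out its computation. The sketch below illustrates the commonly cited form of offset-attention from the Point Cloud Transformer (PCT) literature: self-attention features are subtracted from the input features, the offset is passed through a learned projection, and the result is added back to the input. Whether TSPconv-Net uses exactly this form is an assumption. The function names, the weight matrices `w_q`, `w_k`, `w_v`, `w_o`, and the ReLU standing in for a Linear-BatchNorm-ReLU block are all hypothetical simplifications chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def softmax_columns(scores):
    """Softmax along the first axis, so every column sums to 1."""
    out_cols = []
    for col in transpose(scores):
        peak = max(col)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        out_cols.append([v / total for v in exps])
    return transpose(out_cols)


def l1_normalize_rows(m, eps=1e-9):
    """Divide every row by its (non-negative) sum."""
    return [[v / (eps + sum(row)) for v in row] for row in m]


def offset_attention(features, w_q, w_k, w_v, w_o):
    """Simplified offset-attention over N points with d-dimensional features.

    features: N x d, w_q and w_k: d x d_a, w_v and w_o: d x d.
    Assumption: a plain ReLU replaces the Linear-BatchNorm-ReLU block.
    """
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)
    scores = matmul(q, transpose(k))  # N x N pairwise affinities
    attn = l1_normalize_rows(softmax_columns(scores))
    f_sa = matmul(attn, v)  # self-attention features, N x d
    # The "offset" is the difference between input and attended features.
    offset = [[f - s for f, s in zip(fr, sr)] for fr, sr in zip(features, f_sa)]
    projected = matmul(offset, w_o)
    activated = [[max(0.0, x) for x in row] for row in projected]
    # Residual connection back to the input features.
    return [[f + a for f, a in zip(fr, ar)] for fr, ar in zip(features, activated)]
```

As a usage example, calling `offset_attention(points, w_q, w_k, w_v, w_o)` on an N x d list of per-point features returns an N x d list of refined features. In a full network the weights would be learned and the local branch would be handled separately by submanifold sparse convolution, which is not sketched here.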