GeoFormer: Learning Point Cloud Completion with Tri-Plane Integrated Transformer

Jinpeng Yu, Binbin Huang, Yuxuan Zhang, Huaxia Li, Xu Tang, Shenghua Gao
DOI: https://doi.org/10.1145/3664647.3680842
2024-08-13
Abstract: Point cloud completion aims to recover accurate global geometry and preserve fine-grained local details from partial point clouds. Conventional methods typically predict unseen points directly from 3D point cloud coordinates or use self-projected multi-view depth maps to ease this task. However, these gray-scale depth maps cannot reach multi-view consistency, consequently restricting the performance. In this paper, we introduce a GeoFormer that simultaneously enhances the global geometric structure of the points and improves the local details. Specifically, we design a CCM Feature Enhanced Point Generator to integrate image features from multi-view consistent canonical coordinate maps (CCMs) and align them with pure point features, thereby enhancing the global geometry feature. Additionally, we employ the Multi-scale Geometry-aware Upsampler module to progressively enhance local details. This is achieved through cross attention between the multi-scale features extracted from the partial input and the features derived from previously estimated points. Extensive experiments on the PCN, ShapeNet-55/34, and KITTI benchmarks demonstrate that our GeoFormer outperforms recent methods, achieving the state-of-the-art performance. Our code is available at https://github.com/Jinpeng-Yu/GeoFormer.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### The Problem Addressed by the Paper

This paper addresses point cloud completion: recovering accurate overall geometric structure from a partial point cloud while preserving fine local details. Conventional methods either predict unseen points directly from 3D point cloud coordinates or use self-projected multi-view depth maps to simplify the task. However, these grayscale depth maps are not consistent across views, which limits performance.

The paper proposes a new method, GeoFormer, which improves point cloud completion in two main ways:

1. **Enhancement of Global Geometric Structure**: A CCM Feature Enhanced Point Generator integrates image features from multi-view consistent Canonical Coordinate Maps (CCMs) and aligns them with pure point features, thereby enhancing the global geometric features.
2. **Improvement of Local Details**: A Multi-scale Geometry-aware Upsampler progressively enhances local details through cross attention between the multi-scale features extracted from the partial input and the features derived from previously estimated points.

Extensive experiments on the PCN, ShapeNet-55/34, and KITTI benchmarks demonstrate that GeoFormer outperforms recent methods in the point cloud completion task, achieving state-of-the-art performance.
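
The two ideas summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `render_ccm` and `cross_attention`, the orthographic projection, the assumption that points are normalized to [-1, 1]^3, and the choice of which features act as queries are all illustrative assumptions. The first function shows why a canonical coordinate map can be view-consistent (each pixel stores the canonical xyz of the visible point instead of a view-dependent depth); the second shows single-head cross attention in which features of previously estimated points query features of the partial input.

```python
import numpy as np


def render_ccm(points, resolution=32, view_axis=2):
    """Hypothetical orthographic CCM: each pixel stores the canonical xyz
    (rescaled to [0, 1]) of the visible point. Assumes points in [-1, 1]^3."""
    ccm = np.zeros((resolution, resolution, 3), dtype=np.float32)
    depth = np.full((resolution, resolution), np.inf, dtype=np.float32)
    plane_axes = [a for a in range(3) if a != view_axis]
    # Map the two in-plane coordinates from [-1, 1] to pixel indices.
    pix = np.round((points[:, plane_axes] + 1.0) * 0.5 * (resolution - 1))
    pix = np.clip(pix.astype(int), 0, resolution - 1)
    for (u, v), p in zip(pix, points):
        # Camera sits on the positive side of view_axis, so a larger
        # coordinate along that axis means the point is closer.
        d = -p[view_axis]
        if d < depth[v, u]:
            depth[v, u] = d
            ccm[v, u] = (p + 1.0) * 0.5  # xyz stored like an RGB color
    return ccm


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Single-head scaled dot-product cross attention.
    query_feats:   (N, C) features of previously estimated points (assumed role)
    context_feats: (M, C) features of the partial input (assumed role)"""
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    partial = rng.uniform(-1.0, 1.0, size=(256, 3))
    ccm = render_ccm(partial, resolution=16)
    print("CCM shape:", ccm.shape)

    coarse_feats = rng.normal(size=(8, 4))     # stand-in for estimated-point features
    partial_feats = rng.normal(size=(20, 4))   # stand-in for partial-input features
    w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
    fused, attn = cross_attention(coarse_feats, partial_feats, w_q, w_k, w_v)
    print("Fused feature shape:", fused.shape)
```

Because a pixel's value is the point's canonical coordinate, the same surface point receives the same value in every view, which is the property grayscale depth maps lack. In a multi-scale setting, `cross_attention` would be applied once per feature scale; how the paper combines the scales is not specified in the text above.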