3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration

Liyuan Zhang, Le Hui, Qi Liu, Bo Li, Yuchao Dai
2024-11-12
Abstract: Multi-instance point cloud registration aims to estimate the pose of all instances of a model point cloud in the whole scene. Existing methods all adopt the strategy of first obtaining the global correspondence and then clustering to obtain the pose of each instance. However, due to the cluttered and occluded objects in the scene, it is difficult to obtain an accurate correspondence between the model point cloud and all instances in the scene. To this end, we propose a simple yet powerful 3D focusing-and-matching network for multi-instance point cloud registration by learning the multiple pair-wise point cloud registration. Specifically, we first present a 3D multi-object focusing module to locate the center of each object and generate object proposals. By using self-attention and cross-attention to associate the model point cloud with structurally similar objects, we can locate potential matching instances by regressing object centers. Then, we propose a 3D dual masking instance matching module to estimate the pose between the model point cloud and each object proposal. It applies instance masks and overlap masks to accurately predict the pair-wise correspondence. Extensive experiments on two public benchmarks, Scan2CAD and ROBI, show that our method achieves a new state-of-the-art performance on the multi-instance point cloud registration task. Code is available at https://github.com/zlynpu/3DFMNet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses multi-instance point cloud registration: estimating the poses of all instances of a model point cloud in a scene. Existing methods usually first obtain global correspondences and then cluster them to recover the pose of each instance. However, clutter and occlusion in the scene make it difficult to obtain accurate correspondences between the model point cloud and all instances. To this end, the authors propose a simple yet powerful 3D Focusing-and-Matching Network that solves the problem by learning multiple pair-wise point cloud registrations.

### Main Contributions

1. **New Processing Flow**: Existing methods such as PointCLM and MIRETR mainly learn one-to-many correspondences between a CAD model and multiple objects. This paper instead decomposes the one-to-many problem into multiple one-to-one registration problems by first detecting object centers and then learning the matching between the CAD model and each object proposal.
2. **Performance Improvement**: The method achieves new state-of-the-art performance on the Scan2CAD and ROBI datasets. On the challenging ROBI dataset in particular, it outperforms the previous state-of-the-art method MIRETR by about 7% in the MR, MP, and MF metrics.
3. **Generality**: The decomposition of multi-instance point cloud registration into multiple one-to-one registrations may also carry over to other tasks such as multi-target tracking and map construction.

### Method Overview

1. **3D Multi-Object Focusing Module**: This module regresses the centers of potential objects and generates high-quality object proposals. By learning the correlation between the model point cloud and the scene point cloud, it predicts the offset from each point to its instance center and clusters the results with the DBSCAN algorithm to obtain the object centers.
2. **3D Dual-Mask Instance Matching Module**: This module extracts accurate registration correspondences from local object proposals by learning instance masks and overlap masks. The instance mask filters out background points, and the overlap mask improves the partial registration of incomplete objects.

### Loss Functions

- **Focusing Loss**: Combines circle loss, L1 regression loss, and direction loss to learn better shape features and improve localization accuracy.
- **Matching Loss**: Combines circle loss, negative log-likelihood loss, and mask prediction loss to learn coarse features and precise matching for point cloud registration.

### Experimental Results

- **Datasets**: Experiments were carried out on two publicly available benchmarks, Scan2CAD and ROBI.
- **Evaluation Metrics**: Mean recall (MR), mean precision (MP), and mean F1-score (MF).
- **Performance Comparison**: On the ROBI dataset, the method outperforms the other methods, with an improvement of about 7% in each of MR, MP, and MF.

### Conclusion

This paper proposes a new 3D Focusing-and-Matching Network that addresses multi-instance point cloud registration in complex scenes by decomposing it into multiple one-to-one registration problems. The experimental results show that the method achieves state-of-the-art performance on both datasets, demonstrating its potential in practical applications.
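
The center-finding step of the focusing module (per-point offsets to the instance center, followed by DBSCAN clustering) can be illustrated with a small sketch. This is not the authors' implementation: the offsets below are hand-written stand-ins for what the network would predict, the function names (`vote_centers`, `dbscan`, `cluster_centers`) and the `eps` and `min_samples` values are assumptions chosen for the toy data, and the DBSCAN is a minimal pure-Python version written only to keep the example dependency-free.

```python
import math

def vote_centers(points, offsets):
    """Shift each point by its (predicted) offset so votes pile up near object centers."""
    return [tuple(p + o for p, o in zip(pt, off)) for pt, off in zip(points, offsets)]

def dbscan(votes, eps, min_samples):
    """Minimal DBSCAN: returns one cluster label per vote, with -1 marking noise."""
    n = len(votes)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(n) if math.dist(votes[i], votes[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_samples:  # core point: keep growing the cluster
                queue.extend(nb)
    return labels

def cluster_centers(votes, labels):
    """Average the votes in each cluster to get one center per object proposal."""
    groups = {}
    for vote, label in zip(votes, labels):
        if label >= 0:
            groups.setdefault(label, []).append(vote)
    return [tuple(sum(c) / len(g) for c in zip(*g)) for _, g in sorted(groups.items())]

# Toy scene: two objects centered at (0, 0, 0) and (5, 0, 0), plus one background point.
points = [(0.5, 0, 0), (-0.5, 0, 0), (0, 0.5, 0),
          (5.5, 0, 0), (4.5, 0, 0), (5, 0.5, 0),
          (20, 20, 20)]
# Hypothetical offsets; in the paper these are regressed by the network.
offsets = [(-0.5, 0, 0), (0.5, 0, 0), (0, -0.5, 0),
           (-0.5, 0, 0), (0.5, 0, 0), (0, -0.5, 0),
           (0, 0, 0)]

votes = vote_centers(points, offsets)
labels = dbscan(votes, eps=0.3, min_samples=3)
centers = cluster_centers(votes, labels)
print(labels)
print(centers)
```

Each recovered center would then seed one object proposal, which is registered against the model point cloud on its own; that is the one-to-one decomposition the paper describes. For real point clouds a library implementation of DBSCAN with spatial indexing would be the more practical choice, since the neighbor search here is quadratic in the number of points.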