PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration

Yue Wu, Yongzhe Yuan, Xiaolong Fan, Xiaoshui Huang, Maoguo Gong, Qiguang Miao
DOI: https://doi.org/10.48550/arXiv.2312.06063
2023-12-11
Abstract: We propose a new framework that formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation. During the training stage, the object transformation diffuses from the ground-truth transformation to a random distribution, and the model learns to reverse this noising process. In the sampling stage, the model progressively refines a randomly generated transformation into the output result. We derive the variational bound in closed form for training and provide implementations of the model. Our work provides the following crucial findings: (i) In contrast to most existing methods, our framework, Diffusion Probabilistic Models for Point Cloud Registration (PCRDiffusion), does not require repeatedly updating the source point cloud to refine the predicted transformation. (ii) Point cloud registration, one of the representative discriminative tasks, can be solved in a generative way under a unified probabilistic formulation. Finally, we discuss and provide an outlook on the application of diffusion models in different scenarios for point cloud registration. Experimental results demonstrate that our model achieves competitive performance in point cloud registration. In both correspondence-free and correspondence-based scenarios, PCRDiffusion achieves performance improvements exceeding 50%.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the following problem in point cloud registration: how to predict the target transformation directly from a random transformation with a diffusion probabilistic model, without repeatedly updating the source point cloud to refine the predicted transformation parameters. Traditional methods usually rely on iterative strategies that gradually improve the alignment between the source and target point clouds, which increases computational cost and may get stuck in local optima. The paper proposes a new framework, PCRDiffusion, which treats point cloud registration as a denoising diffusion process from noisy transformations to the target transformation and therefore needs no iterative updates of the source point cloud.

Specifically, the main contributions of the paper include:

1. **Applying the diffusion model to point cloud registration for the first time**: The framework models point cloud registration as a generative denoising process, which the paper presents as the first application of diffusion models to this task.
2. **Advantages of the "noise-to-target-transformation" paradigm**:
   - **No need to iteratively update the source point cloud**: The network is trained once, and neither training nor inference updates the source point cloud through an iterative strategy.
   - **Flexible adjustment of denoising sampling steps**: The number of denoising sampling steps can be raised to improve registration accuracy or lowered to speed up inference.
3. **Extensive experimental verification**: The paper runs experiments with both correspondence-based and correspondence-free methods to verify the effectiveness and performance of PCRDiffusion.
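The "noise-to-target-transformation" idea and its adjustable step count can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the 7-dimensional transformation vector (unit quaternion plus translation), the linear noise schedule, `T = 1000`, and the names `forward_noise`, `sample`, and `oracle` are all hypothetical. The sampler uses a DDIM-style deterministic update as a stand-in, since this summary does not say which sampler the paper uses, and the denoiser is assumed to predict the clean transformation directly.

```python
import numpy as np

T = 1000                              # training diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)    # linear noise schedule (assumed)
alpha_bar = np.cumprod(1.0 - betas)   # cumulative product; index t-1 holds step t

def forward_noise(g0, t, rng):
    """Sample a noisy transformation G_t from G_0 in one shot.

    Uses the standard closed form of a Gaussian noising chain:
    G_t = sqrt(alpha_bar_t) * G_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    ab = alpha_bar[t - 1]
    eps = rng.standard_normal(g0.shape)
    return np.sqrt(ab) * g0 + np.sqrt(1.0 - ab) * eps

def sample(denoiser, dim, num_steps, rng):
    """Refine a random vector into a transformation in `num_steps` steps."""
    # Evenly spaced timesteps from T down to 1; fewer steps means faster inference.
    steps = np.linspace(T, 1, num_steps).round().astype(int)
    g = rng.standard_normal(dim)      # start from pure noise, playing the role of G_T
    for i, t in enumerate(steps):
        ab_t = alpha_bar[t - 1]
        g0_hat = denoiser(g, t)       # estimate of the clean transformation
        eps_hat = (g - np.sqrt(ab_t) * g0_hat) / np.sqrt(1.0 - ab_t)
        # alpha_bar of the next, less noisy step; 1.0 means fully denoised.
        ab_prev = alpha_bar[steps[i + 1] - 1] if i + 1 < num_steps else 1.0
        g = np.sqrt(ab_prev) * g0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return g

# Usage with a stand-in "oracle" denoiser (hypothetical; a trained network,
# conditioned on the source and target point clouds, would go here).
rng = np.random.default_rng(0)
g_true = np.array([1.0, 0.0, 0.0, 0.0, 0.1, -0.2, 0.3])  # assumed quaternion + translation
oracle = lambda g_t, t: g_true
g_fast = sample(oracle, g_true.size, num_steps=2, rng=rng)
g_fine = sample(oracle, g_true.size, num_steps=50, rng=rng)
```

Note that the source point cloud never appears in the loop: only the transformation vector is refined, and `num_steps` is chosen at inference time without retraining. A real pipeline would presumably also renormalize the quaternion part before applying the transformation.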
The experimental results show that PCRDiffusion achieves significant improvements in both registration accuracy and inference speed.

### Mathematical Model

The mathematical model in the paper rests on the forward and reverse processes of a diffusion probabilistic model. The forward process gradually adds Gaussian noise to the target transformation \(G_0\), producing a sequence of increasingly noisy transformations \(G_1, G_2, \ldots, G_T\). The reverse process starts from the noisy transformation \(G_T\) and denoises it step by step to recover the target transformation \(G_0\).

#### Forward Process

The forward process is defined as:

\[q(G_{1:T}|G_0):=\prod_{t = 1}^T q(G_t|G_{t - 1})\]

where

\[q(G_t|G_{t - 1}):=\mathcal{N}(G_t;\sqrt{1-\beta_t}G_{t - 1},\beta_tI)\]

#### Reverse Process

The reverse process is defined as:

\[p_\theta(G_{0:T}):=p(G_T)\prod_{t = 1}^T p_\theta(G_{t - 1}|G_t)\]

where

\[p_\theta(G_{t - 1}|G_t):=\mathcal{N}(G_{t - 1};\mu_\theta(G_t,t),\Sigma_\theta(G_t,t))\]

#### Training Objective

The training objective is to maximize the log-likelihood of the transformation, that is, to minimize the negative log-likelihood:

\[\mathbb{E}_{q(G_0)}[-\log p_\theta(G_0)]\]

Since directly optimizing the exact log-likelihood is not feasible, the variational bound on the negative log-likelihood is optimized instead:

\[\mathbb{E}_{q(G_0)}[-\log p_\theta(G_0)]\leq\mathbb{E}_{q(G_{0:T})}[-\log p_\theta(G_{0:T})+\log q(G_{1:T}|G_0)]\]

The terms \(\log p(G_T)\) and \(\log q(G_{1:T}|G_0)\) do not depend on \(\theta\), so minimizing this bound is equivalent to the final training objective:

\[\max_\theta\mathbb{E}_{G_0\sim q(G_0),G_{1:T}\sim q(G_{1:T}|G_0)}\left[\sum_{t = 1}^T\log p_\theta(G_{t - 1}|G_t)\right]\]

### Experimental Results

The paper conducts experiments on multiple datasets, including the synthetic dataset Mo