SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation

Mingze Sun, Chen Guo, Puhua Jiang, Shiwei Mao, Yurun Chen, Ruqi Huang
2024-10-03
Abstract: In this paper, we propose SRIF, a novel Semantic shape Registration framework based on diffusion-based Image morphing and Flow estimation. More concretely, given a pair of extrinsically aligned shapes, we first render them from multiple views, and then use an image interpolation framework based on diffusion models to generate sequences of intermediate images between them. The images are then fed into a dynamic 3D Gaussian splatting framework, with which we reconstruct and post-process intermediate point clouds that respect the image morphing process. Finally, tailored to the above, we propose a novel registration module that estimates a continuous normalizing flow, which deforms the source shape consistently towards the target, with the intermediate point clouds as weak guidance. Our key insight is to leverage large vision models (LVMs) to associate shapes and thereby obtain much richer semantic information on the relationship between shapes than ad-hoc feature extraction and alignment can provide. As a consequence, SRIF not only achieves high-quality dense correspondences on challenging shape pairs, but also delivers smooth, semantically meaningful interpolation in between. Empirical evidence justifies the effectiveness and superiority of our method as well as specific design choices. The code is released at https://github.com/rqhuang88/SRIF.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is estimating dense correspondences between 3D shapes, in particular semantically meaningful correspondences between shapes that undergo complex, large deformations. Specifically:

1. **Problem Background**:
   - In computer graphics, estimating dense correspondences between 3D shapes underpins many applications, such as 3D reconstruction, animation, and statistical shape analysis.
   - For rigid or isometrically deformed shapes, existing methods rest on a solid theoretical and practical foundation.
   - For shapes undergoing more complex and extensive deformations, however, existing methods struggle to produce high-quality dense correspondences.

2. **Limitations of Existing Methods**:
   - Purely geometric methods usually work coarse-to-fine: they detect a small number of landmark points from geometric features, estimate sparse correspondences between them, and then propagate these to dense correspondences by minimizing distortion. The sparse correspondences, however, are not necessarily semantically related.
   - Methods that obtain high-quality semantic correspondences from user-defined landmarks rely on manual annotation, which limits automation and practicality.
   - Learning-based methods can extract semantic information from data, but because 3D data is scarce they are usually restricted to specific categories, which weakens their practicality.

3. **The Method Proposed in the Paper**:
   - The paper proposes SRIF (Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation), a semantic shape registration framework built on diffusion-based image morphing and flow estimation.
   - The core idea of SRIF is to use large vision models (LVMs) to associate 3D shapes: it generates intermediate images via diffusion-based image morphing, reconstructs intermediate point clouds with a dynamic 3D Gaussian splatting framework, and finally estimates a continuous normalizing flow that deforms the source shape toward the target shape.
   - In this way, SRIF not only achieves high-quality dense correspondences, but also produces a continuous, semantically meaningful deformation process.

4. **Main Contributions**:
   - A new semantic shape registration framework, SRIF, that obtains high-quality dense correspondences between shapes related by complex deformations.
   - Intermediate point clouds generated with diffusion models and dynamic 3D Gaussian splatting reconstruction, which provide rich semantic information.
   - Experiments indicating that SRIF outperforms existing methods on multiple benchmarks and can handle cross-category shape registration.

In summary, this paper targets semantically meaningful dense correspondences between shapes related by complex deformations. It proposes the SRIF framework, which combines diffusion models with flow estimation to achieve high-quality dense correspondences and a semantically meaningful deformation process.
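The flow-based deformation described above can be illustrated with a minimal sketch. This is not the SRIF implementation: the velocity field below is a hand-written, hypothetical stand-in for the learned one, the function names `velocity`, `integrate_flow`, and `nearest_neighbor_correspondence` are illustrative, and nearest-neighbor matching is assumed here only as the simplest way to read correspondences off a deformed shape. The sketch shows the general idea that a point cloud is deformed by integrating dx/dt = v(x, t), which yields every intermediate shape along the way.

```python
import numpy as np

def velocity(points, t):
    """Hypothetical stand-in for a learned velocity field v(x, t).

    Assumption for illustration only: a time-independent swirl about the
    z-axis combined with a mild contraction toward the origin.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    swirl = np.stack([-y, x, np.zeros_like(z)], axis=1)
    return swirl - 0.1 * points

def integrate_flow(points, n_steps=100, t0=0.0, t1=1.0):
    """Deform points by integrating dx/dt = v(x, t) with classical RK4.

    Returns an array of shape (n_steps + 1, N, 3) holding the source,
    all intermediate shapes, and the final deformed shape.
    """
    dt = (t1 - t0) / n_steps
    x = np.asarray(points, dtype=float).copy()
    trajectory = [x.copy()]
    for k in range(n_steps):
        t = t0 + k * dt
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity(x + dt * k3, t + dt)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        trajectory.append(x.copy())
    return np.stack(trajectory)

def nearest_neighbor_correspondence(deformed, target):
    """For each deformed source point, return the index of the closest target point."""
    diff = deformed[:, None, :] - target[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return sq_dist.argmin(axis=1)
```

Because each source point keeps its index throughout the integration, the trajectory gives a smooth interpolation between the shapes, and matching the final state against the target gives one correspondence per source point. In a learned setting the velocity field would be a trained network, and the intermediate point clouds would act as additional targets at intermediate times.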