Explicit 3D Reconstruction from Images with Dynamic Graph Learning and Rendering-Guided Diffusion

Di Wu, Linli Zhou, Jincheng Li, Jianqiao Xiong, Liangtu Song
DOI: https://doi.org/10.1016/j.neucom.2024.128206
IF: 6
2024-01-01
Neurocomputing
Abstract: High-quality 3D reconstruction is becoming increasingly important in a variety of fields. Recently, implicit representation methods have made significant progress in image-based 3D reconstruction. However, these methods tend to yield entangled neural representations that lack support for standard 3D pipelines, and their reconstruction quality usually drops sharply when the number of input views is reduced. To obtain high-quality 3D content, we propose an explicit 3D reconstruction method that extracts textured meshes directly from images and remains robust with fewer input views. Its central components are a dynamic graph convolutional network (GCN) and a rendering-guided diffusion model. The dynamic GCN improves mesh reconstruction quality by effectively aggregating features from vertex neighborhoods. The aggregation is accelerated by sampling geometrically related neighbors with different signed distance function (SDF) signs, the number of which gradually converges during training. The rendering-guided diffusion model learns prior distributions for unseen regions to improve reconstruction from sparse-view inputs. It takes the image rendered under an interpolated camera pose as its conditioning input, and its diffusion strength can be controlled by the rendering loss of the explicit reconstruction. In addition, the rendering-guided diffusion model can be jointly trained to generate plausible novel views with 3D consistency. Experiments demonstrate that our method produces high-quality explicit reconstruction results and maintains realistic reconstruction with sparse-view inputs.
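The abstract does not give the exact sampling or control rules, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that "neighbors with different SDF signs" means mesh-adjacent vertices on the opposite side of the surface, that aggregation is a plain mean over the vertex and its selected neighbors, and that a larger rendering loss maps linearly to a stronger diffusion step. All function names and the parameters loss_ref, s_min, and s_max are hypothetical.

```python
from typing import Dict, List, Sequence


def sign_crossing_neighbors(
    sdf: Sequence[float], adjacency: Dict[int, List[int]]
) -> Dict[int, List[int]]:
    """Keep, for each vertex, only the neighbors whose SDF sign differs.

    Assumption: an edge whose endpoints have opposite SDF signs crosses
    the surface, so those neighbors are the geometrically relevant ones.
    """
    return {
        v: [u for u in nbrs if (sdf[u] > 0) != (sdf[v] > 0)]
        for v, nbrs in adjacency.items()
    }


def aggregate_mean(
    features: Sequence[Sequence[float]], neighbors: Dict[int, List[int]]
) -> List[List[float]]:
    """Average each vertex feature with the features of its selected neighbors."""
    out = []
    for v, feat in enumerate(features):
        group = [feat] + [features[u] for u in neighbors.get(v, [])]
        out.append(
            [sum(f[d] for f in group) / len(group) for d in range(len(feat))]
        )
    return out


def diffusion_strength(
    render_loss: float, loss_ref: float = 1.0, s_min: float = 0.1, s_max: float = 0.9
) -> float:
    """Map a rendering loss to a diffusion strength in [s_min, s_max].

    Assumption: a poorly rendered view (high loss) should be altered more
    by the diffusion model, so strength grows linearly with the loss and
    saturates once the loss reaches loss_ref.
    """
    ratio = min(max(render_loss / loss_ref, 0.0), 1.0)
    return s_min + (s_max - s_min) * ratio


# Toy mesh: four vertices, two inside (negative SDF) and two outside.
sdf_values = [-0.2, 0.3, 0.5, -0.1]
mesh_adjacency = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
vertex_features = [[1.0], [3.0], [5.0], [7.0]]

selected = sign_crossing_neighbors(sdf_values, mesh_adjacency)
aggregated = aggregate_mean(vertex_features, selected)
```

In this toy example, vertex 1 drops its same-sign neighbor 2 and aggregates only over vertices 0 and 3. As the SDF field stabilizes during training, the set of sign-crossing edges would change less from step to step, which is one way to interpret the neighbor count converging.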