PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Stefan Ainetter, Mattia D'Urso, Friedrich Fraundorfer, Vincent Lepetit
2024-04-16
Abstract: We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In comparison to traditional CAD model retrieval methods, the use of shape programs for 3D reconstruction allows for reasoning about the semantic properties of reconstructed objects, editing, low memory footprint, etc. However, the utilization of shape programs for 3D scene understanding has been largely neglected in past works. As our main contribution, we enable gradient-based optimization by introducing a module that translates shape programs designed in Blender, for example, into efficient PyTorch code. We also provide a method that relies on PyTorchGeoNodes and is inspired by Monte Carlo Tree Search (MCTS) to jointly optimize discrete and continuous parameters of shape programs and reconstruct 3D objects for input scenes. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions. Our experiments indicate that our reconstructions match the input scenes well while enabling semantic reasoning about reconstructed objects.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of reconstructing 3D objects from images using interpretable shape programs. It proposes a framework called PyTorchGeoNodes, which optimizes the parameters of a shape program so that the generated 3D object matches the input images.

**Core Issues:**

1. **Semantic Understanding and Editability of 3D Reconstruction**: Traditional CAD model retrieval methods can produce visually appealing 3D objects, but they do not support reasoning about or editing the semantic attributes of the reconstructed objects.
2. **Low Memory Consumption**: Existing representations such as SDFs and NeRFs can handle unseen shapes, but they tend to produce artifacts when parts of the object are occluded, and they consume a significant amount of memory.
3. **Shape Parameter Optimization**: Recovering shape program parameters from images is highly challenging, especially because continuous and discrete parameters must be optimized jointly.

**Main Contributions:**

1. A "compiler" that converts shape programs designed in Blender into PyTorch code. This makes the 3D shape generation process differentiable, so continuous parameters can be optimized with gradients.
2. An algorithm inspired by Monte Carlo Tree Search (MCTS) that jointly optimizes the continuous and discrete parameters of the shape programs.
3. Experiments on synthetic data and real-world scenes that validate the method and demonstrate its advantages for inferring semantic components.
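The split between continuous and discrete parameters can be illustrated with a small, self-contained sketch. It is not the PyTorchGeoNodes API and not the paper's MCTS-inspired search: `shelf_program`, `chamfer`, and `fit` are hypothetical names, the "shape program" is a toy 2D shelf, and the single discrete parameter is handled by plain exhaustive enumeration. The continuous parameters are refined by gradient descent through the differentiable program, which is the role the compiled PyTorch code plays in the summary above.

```python
import torch


def shelf_program(width, height, num_boards):
    """Toy shape program (hypothetical): endpoints of horizontal boards.

    width, height: 0-dim tensors (continuous, differentiable).
    num_boards: Python int (discrete, not differentiable).
    Returns a (2 * num_boards, 2) tensor of 2D points.
    """
    # Board heights are evenly spaced between 0 and `height`.
    levels = torch.linspace(0.0, 1.0, num_boards) * height
    ones = torch.ones(num_boards)
    left = torch.stack([-width / 2 * ones, levels], dim=1)
    right = torch.stack([width / 2 * ones, levels], dim=1)
    return torch.cat([left, right], dim=0)


def chamfer(a, b):
    """Symmetric Chamfer distance using squared Euclidean distances."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(dim=-1)
    return d2.min(dim=1).values.mean() + d2.min(dim=0).values.mean()


def fit(target, num_boards, steps=500, lr=0.02):
    """Gradient-based fit of the continuous parameters for a fixed count."""
    width = torch.tensor(1.0, requires_grad=True)
    height = torch.tensor(1.0, requires_grad=True)
    optimizer = torch.optim.Adam([width, height], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = chamfer(shelf_program(width, height, num_boards), target)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        final = chamfer(shelf_program(width, height, num_boards), target)
    return final.item(), width.item(), height.item()


# Synthetic "observation": a shelf with 4 boards, 0.8 wide and 1.5 tall.
target = shelf_program(torch.tensor(0.8), torch.tensor(1.5), 4)

# Assumption: exhaustive enumeration stands in for the tree search,
# which is only practical because there is a single small discrete parameter.
results = {n: fit(target, n) for n in range(2, 7)}
best_n = min(results, key=lambda n: results[n][0])
best_loss, best_width, best_height = results[best_n]
print(best_n, round(best_width, 2), round(best_height, 2))
```

With several discrete parameters the number of combinations grows multiplicatively, so enumeration stops being practical; that is the situation in which a guided tree search over discrete choices, as described in the summary, becomes attractive.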