Abstract: Autonomous part assembly is a challenging yet crucial task in 3D computer vision and robotics. Analogous to buying a piece of IKEA furniture, given a set of 3D parts that can assemble a single shape, an intelligent agent needs to perceive the 3D part geometry, reason to propose pose estimations for the input parts, and finally call robotic planning and control routines for actuation. In this paper, we focus on the pose estimation subproblem from the vision side, which involves geometric and relational reasoning over the input part geometry. Essentially, the task of generative 3D part assembly is to predict a 6-DoF part pose, including a rigid rotation and translation, for each input part so that the parts assemble into a single 3D shape as the final output. To tackle this problem, we propose an assembly-oriented dynamic graph learning framework that leverages an iterative graph neural network as a backbone. It explicitly conducts sequential part assembly refinements in a coarse-to-fine manner, and exploits a part relation reasoning module paired with a part aggregation module to dynamically adjust both part features and their relations in the part graph. We conduct extensive experiments and quantitative comparisons to three strong baseline methods, demonstrating the effectiveness of the proposed approach.
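
To make the task definition in the abstract concrete, the following minimal sketch shows what applying a predicted 6-DoF pose to each part could look like. It is an illustration under assumptions, not the paper's code: the quaternion parameterization in (w, x, y, z) order, the point-cloud representation of parts, and the function names `quat_to_rotmat`, `apply_pose`, and `assemble` are all hypothetical choices made here.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix.

    The quaternion is normalized first, so any non-zero input is accepted.
    """
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def apply_pose(points, quat, trans):
    """Rotate an (N, 3) part point cloud, then translate it."""
    rotation = quat_to_rotmat(quat)
    # points @ rotation.T applies the rotation to every row vector.
    return np.asarray(points, dtype=float) @ rotation.T + np.asarray(trans, dtype=float)

def assemble(parts, poses):
    """Place every part with its own (quat, trans) pose and stack the results.

    parts: list of (N_i, 3) arrays; poses: list of (quat, trans) tuples.
    """
    placed = [apply_pose(p, q, t) for p, (q, t) in zip(parts, poses)]
    return np.concatenate(placed, axis=0)
```

In this sketch each part carries one rotation and one translation, which together account for the six degrees of freedom; the assembled shape is simply the union of the transformed part point clouds.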

What problem does this paper attempt to address?

The paper addresses autonomous part assembly in 3D computer vision and robotics. Given a set of 3D parts that can be assembled into a single shape, an intelligent agent must perceive the geometry of the parts, propose a pose estimate for each input part, and finally call robotic planning and control routines for actuation. The paper focuses on the vision-side pose estimation subproblem, which involves geometric and relational reasoning: predicting a 6-degree-of-freedom (6-DoF) pose, consisting of a rigid rotation and a translation, for each input part so that the parts assemble into the final 3D shape.

### Specific problems solved by the paper

1. **Part pose estimation**:
   - The objective is to predict a 6-DoF pose (rigid rotation and translation) for each input part so that the parts assemble into a complete 3D shape.
   - This involves complex geometric and relational reasoning and requires finding a suitable assembly in a large space of possible solutions.
2. **Dynamic graph learning framework**:
   - To address these challenges, the paper proposes a dynamic graph learning framework built on an iterative graph neural network (GNN).
   - The framework performs explicit, sequential part assembly refinement, using a part relation reasoning module and a part aggregation module to dynamically adjust the part features and their relations in the part graph.
3. **Multi-modal prediction**:
   - Because a set of parts may admit multiple valid assemblies, the paper adopts the Min-of-N (MoN) loss to balance assembly quality and diversity.
   - This loss encourages at least one of the predictions to be close to the ground truth, thereby improving the robustness and adaptability of the model.

### Main contributions

1. **Dynamic graph learning framework**:
   - A new dynamic graph learning framework is proposed that progressively refines the part pose estimates, assembling the parts from coarse to fine.
   - The iterative graph neural network and the dynamic part relation reasoning module together handle the complex relationships between parts.
2. **Part relation reasoning**:
   - A dynamic part relation reasoning module is introduced that updates the part relation graph in each iteration according to the current part pose estimates.
   - This better captures the mutual influence between parts and improves assembly accuracy.
3. **Part aggregation module**:
   - A dynamic part aggregation module is designed. By alternating the graph structure between a dense node set and a sparse node set, it enables direct information exchange between geometrically equivalent parts.
   - This lets equivalent parts share information while still learning distinct poses.
4. **Experimental verification**:
   - Extensive experiments are carried out on the large-scale synthetic dataset PartNet, demonstrating the effectiveness of the proposed method.
   - Quantitative and qualitative comparisons with three baseline methods show that the proposed method significantly outperforms the baselines on multiple evaluation metrics.

### Conclusion

The paper addresses the pose estimation problem in 3D part assembly by proposing a dynamic graph learning framework. The framework handles complex geometric and relational reasoning and can generate diverse assembly results, which gives it both practical and research value.
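
The Min-of-N (MoN) idea mentioned in the summary can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: the `predict` callable, the Gaussian noise input, the noise dimension, and the mean squared error used as the per-sample distance are all hypothetical stand-ins for whatever network and assembly losses the authors actually use.

```python
import numpy as np

def min_of_n_loss(predict, inputs, target, n_samples=5, noise_dim=8, rng=None):
    """Return the smallest per-sample loss over n_samples noisy predictions.

    predict: hypothetical callable (inputs, noise_vector) -> prediction array.
    Only the best sample contributes to the loss, so the remaining samples
    are free to cover other plausible assemblies.
    """
    rng = np.random.default_rng() if rng is None else rng
    losses = []
    for _ in range(n_samples):
        noise = rng.standard_normal(noise_dim)  # a fresh random vector per sample
        prediction = predict(inputs, noise)
        # Assumed distance: mean squared error against the ground truth.
        losses.append(float(np.mean((prediction - target) ** 2)))
    best = int(np.argmin(losses))
    return losses[best], best
```

Because the minimum is taken over the samples, increasing `n_samples` can only keep the loss equal or lower it for a fixed predictor, which is what leaves room for diverse outputs.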

Generative 3D Part Assembly via Dynamic Graph Learning

Assembly Sequence Generation for New Objects Via Experience Learned from Similar Object

Towards Learning from Demonstration System for Parts Assembly: A Graph Based Representation for Knowledge

KGAssembly: Knowledge Graph-Driven Assembly Process Generation and Evaluation for Complex Components

Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing

3D Part Assembly Generation With Instance Encoded Transformer

Probabilistic Graph Based Spatial Assembly Relation Inference for Programming of Assembly Task by Demonstration

Category-Level Multi-Part Multi-Joint 3D Shape Assembly

Learning Part Generation and Assembly for Structure-aware Shape Synthesis

Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images

Learning Part Generation and Assembly for Sketching Man‐Made Objects

Subassembly to Full Assembly: Effective Assembly Sequence Planning through Graph-based Reinforcement Learning

Study on Generation of 3D Assembly Dimension Chain

3D Assembly Completion

Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly

Score-PA: Score-based 3D Part Assembly

Learning Dynamic Relationships for 3D Human Motion Prediction

Component-aware Generative Autoencoder for Structure Hybrid and Shape Completion

Rearrangement Planning for General Part Assembly

Automatic Generation of 3D Assembly Dimension Chain Based on Feature Model

Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection