Learning Generalizable Tool-use Skills through Trajectory Generation

Carl Qi, Yilin Wu, Lifan Yu, Haoyue Liu, Bowen Jiang, Xingyu Lin, David Held

2024-04-24

Abstract: Autonomous systems that efficiently utilize tools can assist humans in completing many common tasks such as cooking and cleaning. However, current systems fall short of human-level intelligence when adapting to novel tools. Prior works based on affordance often make strong assumptions about the environment and cannot scale to more complex, contact-rich tasks. In this work, we tackle this challenge and explore how agents can learn to use previously unseen tools to manipulate deformable objects. We propose to learn a generative model of tool-use trajectories as a sequence of tool point clouds, which generalizes to different tool shapes. Given any novel tool, we first generate a tool-use trajectory and then optimize the sequence of tool poses to align with the generated trajectory. We train a single model on four different challenging deformable object manipulation tasks, using demonstration data from only one tool per task. The model generalizes to various novel tools, significantly outperforming baselines. We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to that of humans. Additional materials can be found on our project website: https://sites.google.com/view/toolgen

Robotics, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses how autonomous systems can adapt to unfamiliar tools and use them to manipulate deformable objects such as dough. Current systems fall short of human-level intelligence when adapting to new tools, and existing affordance-based approaches often make strong assumptions about the environment, so they do not scale to more complex, contact-rich tasks. The authors propose a method called ToolGen, which learns a generative model of tool-use trajectories represented as sequences of tool point clouds, a representation that generalizes across tool shapes. Given a new tool, ToolGen first generates a tool-use trajectory and then optimizes the sequence of tool poses to match it. A single model is trained on demonstration data from only one tool per task, yet it generalizes to multiple novel tools and performs comparably to humans in the real world. The paper also compares against several baseline methods and reports that ToolGen outperforms them across different tasks, goals, and tools.
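The second stage, fitting a sequence of tool poses to a generated point-cloud trajectory, can be illustrated with a small sketch. This is not the authors' optimizer: the source only states that the poses are optimized to align with the generated trajectory. The sketch assumes known point-to-point correspondences and swaps in the closed-form Kabsch algorithm, and it replaces the learned generative model with a synthetic trajectory. All function names (`fit_rigid_pose`, `fit_pose_sequence`, `rotation_z`) are hypothetical.

```python
import numpy as np


def fit_rigid_pose(tool_points, target_points):
    """Least-squares rigid transform (R, t) such that R @ p + t ~ q.

    Kabsch algorithm. Assumption: row i of tool_points corresponds to
    row i of target_points (the paper does not state this).
    """
    tool_centroid = tool_points.mean(axis=0)
    target_centroid = target_points.mean(axis=0)
    # 3x3 cross-covariance of the centered clouds.
    H = (tool_points - tool_centroid).T @ (target_points - target_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target_centroid - R @ tool_centroid
    return R, t


def fit_pose_sequence(tool_points, generated_trajectory):
    """Fit one rigid pose per frame of a point-cloud trajectory."""
    return [fit_rigid_pose(tool_points, frame) for frame in generated_trajectory]


def rotation_z(angle):
    """Rotation matrix about the z-axis (helper for the synthetic example)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Synthetic stand-in for a generated trajectory: the tool cloud moved
# through four known poses. A real system would get these frames from
# the trained generative model instead.
rng = np.random.default_rng(0)
tool = rng.normal(size=(50, 3))
true_poses = [
    (rotation_z(0.2 * k), np.array([0.1 * k, 0.0, -0.05 * k])) for k in range(4)
]
trajectory = [tool @ R.T + t for R, t in true_poses]

recovered = fit_pose_sequence(tool, trajectory)
max_error = max(
    np.abs(tool @ R.T + t - frame).max()
    for (R, t), frame in zip(recovered, trajectory)
)
print(f"max alignment error: {max_error:.2e}")
```

When the generated point clouds do not come with correspondences, or are not exactly rigid copies of the tool, a correspondence-free loss such as Chamfer distance minimized by gradient descent would be a more natural choice than this closed-form fit.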

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches

DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

Learning to Design and Use Tools for Robotic Manipulation

ToolGen: Unified Tool Retrieval and Calling via Generation

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

Any-point Trajectory Modeling for Policy Learning

SoftGPT: Learn Goal-oriented Soft Object Manipulation Skills by Generative Pre-trained Heterogeneous Graph Transformer

Imagine That! Leveraging Emergent Affordances for 3D Tool Synthesis

General Flow as Foundation Affordance for Scalable Robot Learning

Learning Generalizable 3D Manipulation With 10 Demonstrations

Learning the Generalizable Manipulation Skills on Soft-body Tasks via Guided Self-attention Behavior Cloning Policy

Learning Generalizable Dexterous Manipulation from Human Grasp Affordance

Learning Category-Level Generalizable Object Manipulation Policy Via Generative Adversarial Self-Imitation Learning from Demonstrations

RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

Object-Centric Dexterous Manipulation from Human Motion Data

A Task-Learning Strategy for Robotic Assembly Tasks from Human Demonstrations

Learning to Manipulate Tools by Aligning Simulation to Video Demonstration

ToolNet: Using Commonsense Generalization for Predicting Tool Use for Robot Plan Synthesis

A Two-stage Fine-tuning Strategy for Generalizable Manipulation Skill of Embodied AI