LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models

Alison Bartsch, Amir Barati Farimani

2024-10-01

Abstract: When humans create sculptures, we are able to reason about how geometrically we need to alter the clay state to reach our target goal. We are not computing point-wise similarity metrics, or reasoning about low-level positioning of our tools, but instead determining the higher-level changes that need to be made. In this work, we propose LLM-Craft, a novel pipeline that leverages large language models (LLMs) to iteratively reason about and generate deformation-based crafting action sequences. We simplify and couple the state and action representations to further encourage shape-based reasoning. To the best of our knowledge, LLM-Craft is the first system successfully leveraging LLMs for complex deformable object interactions. Through our experiments, we demonstrate that with the LLM-Craft framework, LLMs are able to successfully reason about the deformation behavior of elasto-plastic objects. Furthermore, we find that LLM-Craft is able to successfully create a set of simple letter shapes. Finally, we explore extending the framework to reaching more ambiguous semantic goals, such as "thinner" or "bumpy". For videos please see our website: https://sites.google.com/andrew.cmu.edu/llmcraft

Robotics

What problem does this paper attempt to address?

This paper addresses how to use large language models (LLMs) to plan complex deformations of elasto-plastic objects such as clay, specifically for robotic sculpting. Existing methods usually focus on low-level dynamics prediction or direct motion imitation. These methods can reach certain goals, but they lack a high-level understanding of, and ability to reason about, material behavior. The paper proposes a new framework, LLM-Craft, that draws on the reasoning ability of LLMs. By simplifying and coupling the state and action representations, the framework lets the LLM reason about geometry at a higher level and generate effective sequences of deformation actions. The goal is not only to reach specific target shapes, but also to explore how LLMs handle more ambiguous semantic goals, such as "thinner" or "bumpy".

The key contributions of the paper are:
1. Proposing what the authors describe as the first system that successfully uses LLMs to manipulate elasto-plastic objects in the real world.
2. Exploring the LLM's reasoning ability at the semantic level and how it helps with sculpting tasks.
3. Showing that, with carefully designed prompts, LLMs can successfully reason about complex interactions between robots and objects.

In short, the work aims to show the potential of LLMs for complex tasks that require high-level understanding, especially understanding of material behavior and long-horizon planning.
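The iterative loop described above (observe the clay state, ask the LLM for the next deformation, execute it, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the occupancy-grid state, the "row,col" action format, the prompt wording, the toy `apply_action` model, and the `fake_llm` stand-in are all hypothetical. The only idea taken from the summary is that state and action share one simplified representation (here, the same grid cells).

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]    # assumed state: 1 = clay in the cell, 0 = empty
Action = Tuple[int, int]  # assumed action: (row, col) of the cell to press


def grid_to_text(grid: Grid) -> str:
    """Serialize the grid so it can be placed in a text prompt."""
    return "\n".join("".join("#" if c else "." for c in row) for row in grid)


def parse_action(reply: str) -> Action:
    """Parse a reply of the form 'row,col' into an action."""
    row, col = reply.strip().split(",")
    return int(row), int(col)


def apply_action(grid: Grid, action: Action) -> Grid:
    """Toy stand-in for the robot: pressing a cell removes the clay in it."""
    r, c = action
    new_grid = [row[:] for row in grid]
    new_grid[r][c] = 0
    return new_grid


def craft(grid: Grid, goal: Grid, ask_llm: Callable[[str], str],
          max_steps: int = 10) -> Grid:
    """Query the model once per step until the goal is reached or steps run out."""
    for _ in range(max_steps):
        if grid == goal:
            break
        prompt = (
            f"Current:\n{grid_to_text(grid)}\n"
            f"Goal:\n{grid_to_text(goal)}\n"
            "Reply with 'row,col' of one cell to press."
        )
        grid = apply_action(grid, parse_action(ask_llm(prompt)))
    return grid


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: picks the first cell that has
    clay in the current state but not in the goal."""
    current_part, goal_part = prompt.split("Goal:\n")
    current = current_part.split("Current:\n")[1].strip().split("\n")
    goal = goal_part.split("\nReply")[0].strip().split("\n")
    for r, (cur_row, goal_row) in enumerate(zip(current, goal)):
        for c, (a, b) in enumerate(zip(cur_row, goal_row)):
            if a == "#" and b == ".":
                return f"{r},{c}"
    return "0,0"


start = [[1, 1, 1], [1, 1, 1]]
goal = [[1, 0, 1], [1, 1, 1]]
result = craft(start, goal, fake_llm)
```

In a real system `ask_llm` would wrap an actual model call and `apply_action` would be replaced by robot execution followed by re-observation of the clay. Because the action names a cell of the same grid the model is shown, the model only has to reason about which part of the shape to change, not about tool poses.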

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks

On the Exploration of LM-Based Soft Modular Robot Design

LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Large Language Model Agent as a Mechanical Designer

Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs

Creative Robot Tool Use with Large Language Models

SculptBot: Pre-Trained Models for 3D Deformable Object Manipulation

Empowering Large Language Models on Robotic Manipulation with Affordance Prompting

CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models

DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models

RoPotter: Toward Robotic Pottery and Deformable Object Manipulation with Structural Priors

Mediating Modes of Thought: LLM's for design scripting

Large Language Models for Orchestrating Bimanual Robots

Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation

LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation

LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play

LaMI: Large Language Models for Multi-Modal Human-Robot Interaction

Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models