Abstract: Reinforcement learning (RL) trains agents to accomplish complex tasks from environmental interaction data, but its capability is limited by the scope of the available data. A promising way to obtain a more knowledgeable agent is to leverage the knowledge in large language models (LLMs). Although previous studies have combined LLMs with RL, seamless integration of the two remains challenging because of the semantic gap between them. This paper introduces Knowledgeable Agents from Language Model Rollouts (KALM), a method that extracts knowledge from LLMs in the form of imaginary rollouts, which the agent can readily learn from using offline reinforcement learning methods. The primary challenge of KALM lies in LLM grounding: LLMs are inherently limited to textual data, whereas environmental data often consist of numerical vectors that LLMs have not seen. To address this, KALM fine-tunes the LLM on various tasks based on environmental data, including bidirectional translation between natural language descriptions of skills and their corresponding rollout data. This grounding process improves the LLM's comprehension of environmental dynamics, enabling it to generate diverse and meaningful imaginary rollouts that reflect novel skills. Initial empirical evaluations on the CLEVR-Robot environment show that KALM enables agents to complete complex rephrasings of task goals and to extend their capabilities to novel tasks that require unprecedented optimal behaviors. KALM achieves a success rate of 46% on tasks with unseen goals, compared with 26% for baseline methods, an absolute improvement of 20 percentage points. These results indicate that the grounded LLM comprehends environmental dynamics well enough to generate meaningful imaginary rollouts for novel skills, supporting the integration of LLMs with RL.
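
As a rough illustration of the data flow the abstract describes, the sketch below pairs natural-language goals with numerical rollouts and pools imaginary rollouts with real ones into a single dataset that an offline RL method could consume. It is only a sketch under stated assumptions: the names `Rollout`, `stub_llm_generate`, and `build_offline_dataset` are hypothetical and do not come from the KALM implementation, and the fine-tuned LLM is replaced by a stub that returns placeholder values.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Rollout:
    """Hypothetical container pairing a language goal with numerical data."""
    goal: str                       # natural-language description of the skill
    states: List[Sequence[float]]   # numerical state vectors
    actions: List[Sequence[float]]  # numerical action vectors
    imagined: bool = False          # True if generated rather than collected


def stub_llm_generate(goal: str, initial_state: Sequence[float],
                      horizon: int) -> Rollout:
    """Stand-in for the grounded LLM (assumption): emits a placeholder rollout.

    A real system would decode states and actions from the model; here the
    initial state is repeated and the actions are zeros.
    """
    states = [list(initial_state) for _ in range(horizon + 1)]
    actions = [[0.0] for _ in range(horizon)]
    return Rollout(goal=goal, states=states, actions=actions, imagined=True)


def build_offline_dataset(real: List[Rollout], novel_goals: List[str],
                          initial_state: Sequence[float],
                          horizon: int) -> List[Rollout]:
    """Pool collected rollouts with imaginary ones for offline RL training."""
    imagined = [stub_llm_generate(g, initial_state, horizon)
                for g in novel_goals]
    return list(real) + imagined
```

The `imagined` flag is kept on every rollout so that a downstream learner could, for example, weight real and imaginary data differently; whether KALM does so is not stated in the abstract.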

Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks

RePLan: Robotic Replanning with Perception and Language Models

Lifelong Robot Learning with Human Assisted Language Planners

SRLM: Human-in-Loop Interactive Social Robot Navigation with Large Language Model and Deep Reinforcement Learning

Language-Conditioned Offline RL for Multi-Robot Navigation

LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning

Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search

Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

Language Models as Zero-Shot Trajectory Generators

Grounding Language Models in Autonomous Loco-manipulation Tasks

3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation

SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models

Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models

ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning

Sequential Planning in Large Partially Observable Environments guided by LLMs

RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models

Prompt, Plan, Perform: LLM-based Humanoid Control via Quantized Imitation Learning

LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation