Abstract: Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making. In this work, we propose a novel view that treats inducing temporal action abstractions as a sequence compression problem. To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) -- to the seemingly distant task of learning skills of variable time span in continuous control domains. We introduce an approach called Primitive Sequence Encoding (PRISE) that combines continuous action quantization with BPE to learn powerful action abstractions. We empirically show that high-level skills discovered by PRISE from a multitask set of robotic manipulation demonstrations significantly boost the performance of both multitask imitation learning as well as few-shot imitation learning on unseen tasks. Our code is released at https://github.com/FrankZheng2022/PRISE.

What problem does this paper attempt to address?

This paper addresses how to improve the performance of behavior cloning (BC) on downstream tasks in continuous control domains, such as robotic manipulation, by learning temporal action abstractions. It proposes a method named Primitive Sequence Encoding (PRISE), which treats the learning of temporal action abstractions as a sequence compression problem. PRISE draws on discrete coding and sequence compression techniques from natural language processing (NLP), in particular byte pair encoding (BPE), to learn temporal action abstractions from multitask offline demonstration datasets.

### Main Problems and Solutions

1. **High-Dimensional Observations and Complex Continuous Action Spaces**
   - Sequential decision-making problems, especially in robotic manipulation, often involve high-dimensional observations (such as images) and complex continuous action spaces. Both make it difficult to learn effective policies directly.
   - **Solution**: Construct abstractions, that is, compact belief-state and action representations that generalize across tasks, so that learning in new scenarios becomes more robust and data-efficient.
2. **Learning of Temporal Action Abstractions**
   - Learning temporal action abstractions (such as representations of multi-step primitive behaviors) has not fully benefited from methods that succeeded in other fields, especially in continuous control.
   - **Solution**: Apply discrete coding and sequence compression to this problem. Continuous actions are quantized into discrete codes, and the BPE algorithm is then applied to the resulting code sequences to identify variable-duration action primitives (skills).
3. **Improving the Learning Efficiency of Downstream Tasks**
   - The temporal action abstractions that PRISE learns from multitask robotic manipulation demonstrations significantly improve the performance of behavior cloning on downstream tasks.
   - **Solution**: The BPE-derived primitives are temporally extended actions, each covering several primitive steps. Using them in place of single-step actions yields better downstream performance, especially for behavior cloning.

### Main Contributions of PRISE

- **Innovatively Combining NLP Methods**: Applies discrete coding and sequence compression techniques from NLP (such as BPE) to the learning of temporal action abstractions in continuous control.
- **Improving Learning Efficiency**: With the learned temporal action abstractions, PRISE significantly improves the performance of behavior cloning on downstream tasks, surpassing some strong existing baselines.
- **Detailed Experimental Verification**: The effectiveness of PRISE is verified through experiments on multiple benchmark datasets.

In summary, the paper targets the problem of learning temporal action abstractions in continuous control and, by introducing PRISE, improves the performance of behavior cloning on downstream tasks.
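The two-stage idea summarized above (quantize continuous actions into discrete codes, then run BPE over the code sequences) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a fixed, hand-written codebook with nearest-neighbor lookup in place of a learned quantizer, and the names `quantize`, `learn_bpe`, `merge_pair`, and `expand` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter


def quantize(actions, codebook):
    """Map each continuous action vector to the index of its nearest code.

    Assumption: a fixed codebook with squared-Euclidean nearest-neighbor
    lookup stands in for whatever quantizer is actually trained.
    """
    codes = []
    for action in actions:
        dists = [sum((a - c) ** 2 for a, c in zip(action, code))
                 for code in codebook]
        codes.append(dists.index(min(dists)))
    return codes


def merge_pair(seq, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_codes, num_merges):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair.

    Returns a dict mapping each new token to the pair it replaces, plus the
    compressed sequences. Each new token is a candidate multi-step "skill".
    """
    seqs = [list(s) for s in sequences]
    merges = {}
    next_token = num_codes  # new tokens are numbered after the base codes
    for _ in range(num_merges):
        counts = Counter()
        for s in seqs:
            counts.update(zip(s, s[1:]))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # nothing repeats, so merging would not compress
            break
        merges[next_token] = pair
        seqs = [merge_pair(s, pair, next_token) for s in seqs]
        next_token += 1
    return merges, seqs


def expand(token, merges):
    """Unroll a (possibly merged) token back into its base action codes."""
    if token not in merges:
        return [token]
    left, right = merges[token]
    return expand(left, merges) + expand(right, merges)


# Toy usage: three base codes, two demonstrations already quantized.
demos = [[0, 1, 2, 0, 1, 2, 0, 1], [0, 1, 2, 2]]
merges, compressed = learn_bpe(demos, num_codes=3, num_merges=2)
print(merges)      # {3: (0, 1), 4: (3, 2)}
print(compressed)  # [[4, 4, 3], [4, 2]]
```

In this toy run, token `4` expands to the three-step code sequence `[0, 1, 2]`, which is the kind of variable-length primitive a downstream policy could predict in place of single-step actions.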

PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control

Embodied Executable Policy Learning with Language-based Scene Summarization

From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control

QueST: Self-Supervised Skill Abstractions for Learning Continuous Control

LISA: Learning Interpretable Skill Abstractions from Language

Autoregressive Action Sequence Learning for Robotic Manipulation

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding

LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation

CompILE: Compositional Imitation Learning and Execution

Language-Conditioned Imitation Learning with Base Skill Priors under Unstructured Data

Skill Induction and Planning with Latent Language

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

Abstract Spatial-Temporal Reasoning Via Probabilistic Abduction and Execution

Reinforcement learning under temporal logic constraints as a sequence modelling problem

Active Learning of Abstract Plan Feasibility

Chain-of-Thought Predictive Control

Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning

Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation

LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following

Learning Planning Abstractions from Language

PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning