Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Norman Di Palo, Leonard Hasenclever, Jan Humplik, Arunkumar Byravan
2024-07-30
Abstract: We introduce Diffusion Augmented Agents (DAAG), a novel framework that leverages large language models, vision language models, and diffusion models to improve sample efficiency and transfer learning in reinforcement learning for embodied agents. DAAG hindsight relabels the agent's past experience by using diffusion models to transform videos in a temporally and geometrically consistent way to align with target instructions with a technique we call Hindsight Experience Augmentation. A large language model orchestrates this autonomous process without requiring human supervision, making it well-suited for lifelong learning scenarios. The framework reduces the amount of reward-labeled data needed to 1) finetune a vision language model that acts as a reward detector, and 2) train RL agents on new tasks. We demonstrate the sample efficiency gains of DAAG in simulated robotics environments involving manipulation and navigation. Our results show that DAAG improves learning of reward detectors, transferring past experience, and acquiring new tasks - key abilities for developing efficient lifelong learning agents. Supplementary material and visualizations are available on our website: https://sites.google.com/view/diffusion-augmented-agents/
Machine Learning, Artificial Intelligence, Robotics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper addresses data efficiency and transfer learning in reinforcement learning, particularly for robotic manipulation and navigation tasks. The authors propose a framework called "Diffusion Augmented Agents (DAAG)" that targets three areas:

1. **Improving Sample Efficiency**: DAAG leverages pre-trained large language models, vision-language models, and diffusion models to reduce the amount of reward-labeled data needed, both for finetuning a vision-language model as a reward detector and for training RL agents on new tasks.
2. **Experience Reuse and Transfer**: A technique called "Hindsight Experience Augmentation" uses diffusion models to modify videos of past experience in a temporally and geometrically consistent way so that they match target instructions, allowing experience to be transferred across tasks.
3. **Autonomous Learning Capability**: The framework operates without human supervision. It autonomously sets and evaluates sub-goals and reuses past experience to accelerate learning of new tasks.

Experiments in simulated robotic environments show sample efficiency gains in learning reward detectors, transferring past experience, and acquiring new tasks. These capabilities are key for developing efficient lifelong learning agents.
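
The Hindsight Experience Augmentation idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Episode` type, the function names, and both toy stand-ins are assumptions made for illustration. In particular, `toy_propose_swap` stands in for a language-model query and `toy_edit_frames` stands in for a diffusion-based video edit; here they operate on plain strings so the control flow can be run and inspected.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Episode:
    """Hypothetical container for one stored episode."""
    frames: List[str]   # stand-in for video frames (text descriptions here)
    instruction: str    # instruction the episode is labelled with


# A swap is a (source, target) pair, e.g. ("red", "blue").
Swap = Tuple[str, str]


def hindsight_augment(
    buffer: List[Episode],
    target_instruction: str,
    propose_swap: Callable[[Episode, str], Optional[Swap]],
    edit_frames: Callable[[List[str], str, str], List[str]],
) -> List[Episode]:
    """Relabel past episodes so they align with target_instruction.

    propose_swap plays the role of the orchestrating language model
    (assumption); edit_frames plays the role of the diffusion model
    (assumption). Episodes with no useful swap are skipped.
    """
    augmented = []
    for episode in buffer:
        swap = propose_swap(episode, target_instruction)
        if swap is None:
            continue
        source, target = swap
        new_frames = edit_frames(episode.frames, source, target)
        augmented.append(Episode(new_frames, target_instruction))
    return augmented


def toy_propose_swap(episode: Episode, target_instruction: str) -> Optional[Swap]:
    """Toy stand-in for an LLM: accept only a single differing word."""
    old_words = episode.instruction.split()
    new_words = target_instruction.split()
    if len(old_words) != len(new_words):
        return None
    diffs = [(a, b) for a, b in zip(old_words, new_words) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def toy_edit_frames(frames: List[str], source: str, target: str) -> List[str]:
    """Toy stand-in for a diffusion edit: rewrite each frame description."""
    return [frame.replace(source, target) for frame in frames]


buffer = [
    Episode(["red cube on table", "red cube in gripper"], "lift the red cube"),
    Episode(["drawer closed", "drawer open"], "open the drawer"),
]
relabelled = hindsight_augment(
    buffer, "lift the blue cube", toy_propose_swap, toy_edit_frames
)
for episode in relabelled:
    print(episode.instruction, episode.frames)
```

Passing the two model roles in as callables keeps the relabeling loop independent of any particular language model or diffusion pipeline, so either stand-in could be swapped for a real model without changing the loop.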