Abstract: The human brain and human behavior are a rich source of inspiration for novel control and learning methods in robotics. To exemplify such a development, we draw on how humans acquire knowledge and transfer skills among tasks and introduce a novel multi-task reinforcement learning framework named Episodic Return Progress with Bidirectional Progressive Neural Networks (ERP-BPNN). The proposed ERP-BPNN model (1) learns in a human-like interleaved manner by (2) switching tasks autonomously based on a novel intrinsic motivation signal and, in contrast to existing methods, (3) allows bidirectional skill transfer among tasks. ERP-BPNN is a general architecture applicable to several multi-task learning settings; in this paper, we present the details of its neural architecture and show that it enables effective learning and skill transfer among morphologically different robots in a reaching task. The Bidirectional Progressive Neural Network (BPNN) architecture enables bidirectional skill transfer without requiring incremental training and integrates seamlessly with online task arbitration. The task arbitration mechanism is based on soft Episodic Return Progress (ERP), a novel intrinsic motivation (IM) signal. To evaluate our method, we use quantifiable robotics metrics such as 'expected distance to goal' and 'path straightness' in addition to episodic return, the usual reward-based measure in reinforcement learning. In simulation experiments, we show that, compared to the baselines, ERP-BPNN achieves faster cumulative convergence and improves performance on all metrics considered across morphologically different robots.
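The abstract does not give the exact form of the soft ERP signal, so the following is only a minimal sketch of how progress-based soft task arbitration could look. It assumes that progress is the difference between the mean episodic return of two consecutive windows and that tasks are sampled from a softmax over these progress values. The function names, the `window` size, and the `temperature` parameter are all hypothetical and are not taken from the paper.

```python
import math
import random


def return_progress(returns, window=5):
    """Hypothetical progress signal: mean of the last `window` episodic
    returns minus the mean of the `window` returns before them.
    Returns 0.0 until two full windows of history are available."""
    if len(returns) < 2 * window:
        return 0.0
    recent = returns[-window:]
    earlier = returns[-2 * window:-window]
    return sum(recent) / window - sum(earlier) / window


def task_probabilities(progress, temperature=1.0):
    """Softmax over per-task progress values (an assumed 'soft' selection
    rule). Subtracting the maximum keeps the exponentials numerically stable."""
    scaled = [p / temperature for p in progress]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_task(histories, temperature=1.0, rng=random):
    """Sample the index of the next task to train on, favoring tasks
    whose episodic return has recently improved the most."""
    progress = [return_progress(h) for h in histories]
    probs = task_probabilities(progress, temperature)
    return rng.choices(range(len(histories)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Two hypothetical robots: the first is still improving, the second has plateaued.
    histories = [
        [0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0],
        [1.0] * 10,
    ]
    print(task_probabilities([return_progress(h) for h in histories]))
    print(pick_task(histories))
```

In this sketch a lower `temperature` makes the arbitration greedier toward the task with the largest recent progress, while a higher one approaches uniform interleaving of tasks.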

Bidirectional Progressive Neural Networks with Episodic Return Progress for Emergent Task Sequencing and Robotic Skill Transfer