Abstract: Using direct reinforcement learning (RL) to accomplish a task can be very inefficient, especially in robotic configurations where interactions with the environment are lengthy and costly. Learning from expert demonstration (LfD) is an alternative approach that yields better performance in an RL setting and also greatly improves sample efficiency. We propose a novel demonstration learning framework for actor-critic based algorithms. Firstly, we put forward an environment pre-training paradigm to initialize the model parameters without interacting with the target environment, which effectively avoids the cold start problem in deep RL scenarios. Secondly, we design a general-purpose LfD framework for most of the mainstream actor-critic RL algorithms that include a policy network and a value function, such as PPO, SAC, TRPO, and A3C. Thirdly, we build a dedicated model training platform to perform the human-robot interaction and numerical experimentation. We evaluate the method in six MuJoCo simulated locomotion environments and on our robot control simulation platform. Results show that several epochs of pre-training can improve the agent’s performance in the early stage of training. The final converged performance of the RL algorithm is also boosted by external demonstration. In general, sample efficiency is improved by 30% with the proposed method. Our demonstration pipeline makes full use of the exploration property of the RL algorithm, which makes it feasible to teach robots quickly in dynamic environments.
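The abstract does not spell out the pre-training procedure, so the following is only a minimal sketch of one common way to initialize an actor and a critic from demonstrations without touching the target environment. It assumes linear models, behavior cloning for the actor, and regression onto discounted Monte Carlo returns for the critic; the function names `discounted_returns` and `pretrain_from_demos`, the ridge term, and the single-trajectory input are illustrative assumptions, not the authors' method.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Discounted return G_t = r_t + gamma * G_{t+1} for one trajectory."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def pretrain_from_demos(states, actions, rewards, gamma=0.99, ridge=1e-3):
    """Fit linear actor and critic weights from one demonstration trajectory.

    Assumption: both models are linear in the state (plus a bias term).
    The actor is fit by behavior cloning (regress actions on states);
    the critic is fit by regressing discounted returns on states.
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    # Append a constant column so the models can learn a bias.
    features = np.hstack([states, np.ones((len(states), 1))])
    # Ridge-regularized normal equations, shared by both regressions.
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    actor_w = np.linalg.solve(gram, features.T @ actions)
    critic_w = np.linalg.solve(gram, features.T @ discounted_returns(rewards, gamma))
    return actor_w, critic_w


# Synthetic "expert" data: actions are a fixed linear function of the state.
rng = np.random.default_rng(0)
demo_states = rng.normal(size=(200, 3))
expert_map = np.array([[1.0, -0.5], [0.0, 2.0], [0.5, 0.5]])
demo_actions = demo_states @ expert_map
demo_rewards = -np.sum(demo_states ** 2, axis=1)

actor_w, critic_w = pretrain_from_demos(demo_states, demo_actions, demo_rewards)
```

In an actual actor-critic setup, weights obtained this way would serve only as a starting point; the RL algorithm (PPO, SAC, and so on) would continue training from them through its usual interaction with the environment.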

Content Classification Tasks with Data Preprocessing Manifestations

Pretraining Representations for Data-Efficient Reinforcement Learning

Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning

Become a Proficient Player with Limited Data through Watching Pure Videos

Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning

Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning

Learning Visual Robotic Control Efficiently with Contrastive Pre-training and Data Augmentation

Automated Image Data Preprocessing with Deep Reinforcement Learning

Deep reinforcement learning from human preferences

A Data-Efficient Training Method for Deep Reinforcement Learning

Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments

Emergent Solutions to High-Dimensional Multitask Reinforcement Learning

Behavior From the Void: Unsupervised Active Pre-Training

Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing

State of the Art Control of Atari Games Using Shallow Reinforcement Learning

A Data-efficiency Training Framework for Deep Reinforcement Learning

Pretraining in Deep Reinforcement Learning: A Survey

Diagnosing and exploiting the computational demands of videos games for deep reinforcement learning

Pretraining & Reinforcement Learning: Sharpening the Axe Before Cutting the Tree

Object-sensitive Deep Reinforcement Learning