Abstract: Research on robot automation, which aims to make robots capable of interacting directly with the world and with humans, has grown substantially. Learning from human demonstration is central to this effort. However, dependence on demonstrations restricts a robot to a fixed scenario and leaves it unable to explore varied situations in order to accomplish the same task shown in the demonstration. Deep reinforcement learning may allow a robot to learn beyond human demonstration and complete the task in unknown situations. Exploration is the core of such generalization to different environments, but exploration in reinforcement learning can be ineffective and suffers from low sample efficiency. In this paper, we present Evolutionary Policy Gradient (EPG), which lets a robot learn from demonstration and perform goal-oriented exploration efficiently. Through goal-oriented exploration, our method generalizes the robot's learned skill to environments with different parameters. EPG combines parameter perturbation with the policy gradient method in the framework of Evolutionary Algorithms (EAs) and fuses the benefits of both, achieving effective and efficient exploration. With demonstrations guiding the evolutionary process, the robot can accelerate goal-oriented exploration and generalize its capability to varied scenarios. Experiments on robot control tasks in OpenAI Gym with dense and sparse rewards show that EPG provides competitive performance compared with the original policy gradient methods and EAs. In the manipulator task, our robot learns to open a door using vision in environments that differ from those in which the demonstrations were provided.
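The combination described in the abstract (parameter perturbation plus a gradient step inside an evolutionary loop, seeded by a demonstration) can be illustrated with a toy sketch. This is not the authors' EPG implementation, only a minimal illustration under stated assumptions. The names `toy_return`, `toy_gradient`, and `evolve` are hypothetical. The quadratic objective stands in for an episode return, and its analytic gradient stands in for a policy gradient estimate. Seeding the population from `demo_theta` is one assumed way a demonstration might guide the search.

```python
import random


def toy_return(theta, goal):
    """Hypothetical stand-in for an episode return: higher is better, peak at `goal`."""
    return -sum((t - g) ** 2 for t, g in zip(theta, goal))


def toy_gradient(theta, goal):
    """Analytic gradient of toy_return; a real agent would estimate a policy gradient."""
    return [-2.0 * (t - g) for t, g in zip(theta, goal)]


def evolve(demo_theta, goal, generations=30, pop_size=8, elite=2,
           sigma=0.1, lr=0.1, seed=0):
    """Toy evolutionary loop: perturb parameters, refine by gradient, keep elites."""
    rng = random.Random(seed)
    # Assumption: the demonstration provides the initial parameters for every member.
    population = [list(demo_theta) for _ in range(pop_size)]
    for _ in range(generations):
        candidates = []
        for theta in population:
            # Mutation: Gaussian perturbation in parameter space.
            mutated = [t + rng.gauss(0.0, sigma) for t in theta]
            # Refinement: one gradient-ascent step on the (toy) return.
            grad = toy_gradient(mutated, goal)
            candidates.append([t + lr * g for t, g in zip(mutated, grad)])
        # Selection: rank by return and refill the population from the elites.
        candidates.sort(key=lambda th: toy_return(th, goal), reverse=True)
        elites = candidates[:elite]
        population = [list(elites[i % elite]) for i in range(pop_size)]
    return population[0]  # best member of the final generation


demo = [0.0, 0.0]    # parameters assumed to come from a demonstration
goal = [1.0, -1.0]   # optimum of a hypothetical changed environment
best = evolve(demo, goal)
print(toy_return(demo, goal), toy_return(best, goal))
```

In this sketch the perturbation supplies exploration and the gradient step supplies local improvement. A real system would replace the toy objective with rollouts in the target environment.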

A Versatile Agent for Fast Learning from Human Instructors

Generalize Robot Learning from Demonstration to Variant Scenarios with Evolutionary Policy Gradient

Policy Stitching: Learning Transferable Robot Policies

Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

Selective Policy Transfer in Multi-Agent Systems with Sparse Interactions

Efficient Robot Skill Learning with Imitation from a Single Video for Contact-Rich Fabric Manipulation

Integrating human learning and reinforcement learning: A novel approach to agent training

Robot learning on the job: Human-in-the-loop autonomy and learning during deployment

Learning and generalization of task-parameterized skills through few human demonstrations

EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning

NeuronsGym: A Hybrid Framework and Benchmark for Robot Tasks with Sim2Real Policy Learning

EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data

Extending Policy from One-Shot Learning through Coaching

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

One-shot sim-to-real transfer policy for robotic assembly via reinforcement learning with visual demonstration

Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition

An Efficient Model-Based Approach on Learning Agile Motor Skills without Reinforcement