Fuzzy Ensembles of Reinforcement Learning Policies for Robotic Systems with Varied Parameters

Abdel Gafoor Haddad, Mohammed B. Mohiuddin, Igor Boiko, Yahya Zweiri

2023-11-09

Abstract: Reinforcement Learning (RL) is an emerging approach to control many dynamical systems for which classical control approaches are not applicable or insufficient. However, the resultant policies may not generalize to variations in the parameters that the system may exhibit. This paper presents a powerful yet simple algorithm in which collaboration is facilitated between RL agents that are trained independently to perform the same task but with different system parameters. The independence among agents allows the exploitation of multi-core processing to perform parallel training. Two examples are provided to demonstrate the effectiveness of the proposed technique. The main demonstration is performed on a tracking problem for a quadrotor with a slung load in a real-time experimental setup. It is shown that the developed ensemble outperforms individual policies by reducing the tracking RMSE. The robustness of the ensemble is also verified against wind disturbance.

Robotics, Systems and Control

What problem does this paper attempt to address?

The paper addresses the generalization problem of Reinforcement Learning (RL) policies when system parameters change. Specifically, it proposes a robust and simple algorithm that uses fuzzy clustering to combine multiple independently trained RL agents, so that together they can handle control tasks for robotic systems with different parameters. The main contributions of the paper are as follows:

1. **Developed a fuzzy clustering-based RL agent ensemble**: This method controls dynamic systems while maintaining stable performance when the systems' parameters change, online or offline, without requiring additional training.
2. **Practical experimental validation**: The research team tested the proposed algorithm on a real quadrotor UAV system with a suspended load. In these experiments, the ensemble achieved a lower tracking Root Mean Square Error (RMSE) than the individual policies.

The paper also discusses related work, such as Domain Randomization (DR) and ensemble learning in RL, and points out their respective limitations. Additionally, the authors demonstrate through theoretical analysis, simulation, and experimental results that the proposed method has advantages over traditional DR techniques, especially in handling system parameters that change over time.
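The summary above does not spell out how the ensemble weights its members, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each policy was trained at one known value of a scalar system parameter (the hypothetical `centers`), that the current parameter value is available at run time, and that actions are blended with fuzzy c-means style memberships. The names `fuzzy_memberships`, `ensemble_action`, the toy proportional policies, and the load-mass values are all illustrative.

```python
from typing import Callable, List, Sequence

# A policy maps an observation vector to an action vector.
Policy = Callable[[Sequence[float]], List[float]]


def fuzzy_memberships(param: float, centers: Sequence[float], m: float = 2.0) -> List[float]:
    """Fuzzy c-means style membership of `param` in each center.

    The returned weights are non-negative and sum to 1. `m` > 1 is the
    fuzzifier: values near 1 approach a hard (nearest-center) assignment.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [abs(param - c) for c in centers]
    # A parameter that coincides with a center belongs fully to that center.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** (-exponent) for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_action(
    obs: Sequence[float],
    param: float,
    policies: Sequence[Policy],
    centers: Sequence[float],
    m: float = 2.0,
) -> List[float]:
    """Blend the actions of all policies, weighted by fuzzy membership."""
    weights = fuzzy_memberships(param, centers, m)
    actions = [policy(obs) for policy in policies]
    dim = len(actions[0])
    return [sum(w * a[k] for w, a in zip(weights, actions)) for k in range(dim)]


# Hypothetical stand-ins for trained RL policies: proportional controllers
# with different gains, each assumed to be tuned for one load mass (kg).
policies: List[Policy] = [
    lambda obs: [-1.0 * obs[0]],  # assumed trained at 0.1 kg
    lambda obs: [-3.0 * obs[0]],  # assumed trained at 0.3 kg
]
centers = [0.1, 0.3]

# A load mass halfway between the two training values gives roughly equal
# weights, so the blended action lies midway between the two policy outputs.
action = ensemble_action([0.5], 0.2, policies, centers)
print(action)
```

Because the weights are recomputed at every call, this kind of blending can follow a parameter that drifts during operation without retraining any member, which is the property the summary attributes to the ensemble.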

Robust Deep Reinforcement Learning for Quadcopter Control

Robust Adaptive Ensemble Adversary Reinforcement Learning

Learning Robust Policies via Interpretable Hamilton-Jacobi Reachability-Guided Disturbances

Obtaining Robust Control and Navigation Policies for Multi-Robot Navigation via Deep Reinforcement Learning

Model-assisted Reinforcement Learning of a Quadrotor

Policy ensemble gradient for continuous control problems in deep reinforcement learning

How to Train Your Quadrotor: A Framework for Consistently Smooth and Responsive Flight Control via Reinforcement Learning

Reinforcement Learning in Robotics: Applications and Real-World Challenges

On Training Flexible Robots using Deep Reinforcement Learning

Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble

Collision Avoidance and Navigation for a Quadrotor Swarm Using End-to-end Deep Reinforcement Learning

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Efficient Domain Coverage for Vehicles with Second-Order Dynamics via Multi-Agent Reinforcement Learning

Data-Efficient Hierarchical Reinforcement Learning for Robotic Assembly Control Applications

Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone

Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback

Decomposing Control Lyapunov Functions for Efficient Reinforcement Learning

Robotic Search & Rescue via Online Multi-task Reinforcement Learning

Towards Hardware Accelerated Reinforcement Learning for Application-Specific Robotic Control

Sub-optimal Policy Aided Multi-Agent Reinforcement Learning for Flocking Control