Abstract: The learning inefficiency of reinforcement learning (RL) from scratch hinders its practical application towards continuous robotic tracking control, especially for high-dimensional robots. This work proposes a data-informed residual reinforcement learning (DR-RL) based robotic tracking control scheme applicable to robots with high dimensionality. The proposed DR-RL methodology outperforms common RL methods regarding sample efficiency and scalability. Specifically, we first decouple the original robot into low-dimensional robotic subsystems, and further utilize one-step backward (OSBK) data to construct incremental subsystems that are equivalent model-free representations of the above decoupled robotic subsystems. The formulated incremental subsystems allow for parallel learning to relieve computation load and offer us mathematical descriptions of robotic movements for conducting theoretical analysis. Then, we apply DR-RL to learn the tracking control policy, a combination of incremental base policy and incremental residual policy, under a parallel learning architecture. The incremental residual policy uses the guidance from the incremental base policy as the learning initialization and further learns from interactions with environments to endow the tracking control policy with adaptability towards dynamically changing environments. Our proposed DR-RL based tracking control scheme is developed with rigorous theoretical analysis of system stability and weight convergence. The effectiveness of our proposed method is validated numerically on a 7-DoF KUKA iiwa robot manipulator and experimentally on a 3-DoF robot manipulator, a task on which counterpart RL methods fail.

What problem does this paper attempt to address?

This paper aims to achieve efficient tracking control for high-dimensional robots. RL methods that learn from scratch suffer from low learning efficiency when applied to continuous robotic tracking control, especially for high-dimensional robots. They require large amounts of training data, and in practice the extended trial-and-error interaction can cause mechanical wear or even damage to the robot and its environment. High dimensionality further worsens the sample complexity. To overcome these problems, the paper proposes a data-informed residual reinforcement learning (DR-RL) method that aims to improve sample efficiency and scalability. The main contributions are:

1. **Data-efficient high-dimensional robot tracking control scheme**: A data-efficient and scalable DR-RL tracking control scheme for high-dimensional robots is proposed, which outperforms common RL methods on the experimental tasks.
2. **Data-informed incremental subsystems**: The high-dimensional robot is decoupled into multiple low-dimensional subsystems, and incremental subsystems are constructed using one-step backward (OSBK) data. These incremental subsystems improve sample efficiency, provide a mathematical description of robot motion that facilitates theoretical analysis, and allow a parallel learning architecture that reduces the computational load.
3. **Proof of system stability and weight convergence**: Based on the constructed incremental subsystems and the off-policy experience data used, a theoretical proof of system stability and weight convergence is provided.
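The incremental subsystem idea in contribution 2 can be illustrated with a small numerical sketch. It is not the paper's implementation: the scalar dynamics `f`, the gains `G_TRUE` and `G_BAR`, the input signal, and all function names are hypothetical choices made for illustration. The sketch assumes a scalar subsystem of the form x_dot = f(x) + g·u and predicts the current state derivative from the previous step's measured derivative and input, without using `f` or the true gain. Under these assumptions, the gap between the prediction and the true derivative should shrink as the sampling interval shrinks.

```python
import math

G_TRUE = 1.5  # true input gain, unknown to the controller (hypothetical value)
G_BAR = 1.0   # constant gain guess used by the incremental model (hypothetical value)


def f(x):
    """Hypothetical internal dynamics, unknown to the controller."""
    return -2.0 * x + math.sin(x)


def x_dot(x, u):
    """True scalar subsystem: x_dot = f(x) + g * u."""
    return f(x) + G_TRUE * u


def incremental_prediction(x_dot_prev, u_prev, u):
    """Model-free prediction from one-step-backward data.

    Uses only the previously measured derivative and input:
    x_dot ~= x_dot_prev + G_BAR * (u - u_prev).
    """
    return x_dot_prev + G_BAR * (u - u_prev)


def max_prediction_error(dt, horizon=2.0):
    """Largest gap between the true derivative and the incremental prediction."""
    steps = int(round(horizon / dt))
    x, u_prev = 0.5, 0.0
    x_dot_prev = x_dot(x, u_prev)
    worst = 0.0
    for k in range(1, steps + 1):
        x += dt * x_dot_prev           # Euler step under the previous input
        u = math.sin(k * dt)           # smooth test input (arbitrary choice)
        actual = x_dot(x, u)
        predicted = incremental_prediction(x_dot_prev, u_prev, u)
        # The gap plays the role of the lumped estimation-error term.
        worst = max(worst, abs(actual - predicted))
        x_dot_prev, u_prev = actual, u  # current data becomes the backward data
    return worst


if __name__ == "__main__":
    for dt in (0.01, 0.001):
        print(f"dt={dt}: max prediction error = {max_prediction_error(dt):.5f}")
```

The prediction error here comes from two sources: the change in `f` over one sampling interval, and the mismatch between `G_BAR` and `G_TRUE` multiplied by the input increment. Both scale with the sampling interval in this toy setup, which is why a fast sampling rate makes the one-step-backward approximation reasonable.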
### Specific problem description in the paper

The paper focuses on high-dimensional robot tracking control tasks (Problem 1): given a desired trajectory \( \mathbf{x}_d\in\mathbb{R}^n \), learn an efficient tracking control policy \( \mathbf{u}(\mathbf{x}) \) so that the high-dimensional robot accurately tracks this trajectory.

### Solutions

1. **Decoupling techniques**: Decompose the high-dimensional robot into multiple low-dimensional subsystems. The dynamic equation of each subsystem is

   \[ \dot{\mathbf{x}}_i=\mathbf{f}_i + \mathbf{g}_i\mathbf{u}_i,\quad i = 1,2,\ldots,N \]

   where \( \mathbf{x}_i\in\mathbb{R}^{n_i} \) and \( \mathbf{u}_i\in\mathbb{R}^{m_i} \) are the local state and control input of the \( i \)-th subsystem respectively, \( \mathbf{f}_i\in\mathbb{R}^{n_i} \) is a combination of local internal dynamics and coupling terms, and \( \mathbf{g}_i\in\mathbb{R}^{n_i\times m_i} \) is the local input gain matrix.
2. **Incremental subsystems**: Use OSBK data to estimate the unknown model knowledge \( \mathbf{f}_i \) and \( \mathbf{g}_i \), and construct the incremental subsystems

   \[ \dot{\mathbf{x}}_i=\dot{\mathbf{x}}_{i,0}+\bar{\mathbf{g}}_i(\Delta\mathbf{u}_i+\boldsymbol{\xi}_i) \]

   where \( \Delta\mathbf{u}_i=\mathbf{u}_i-\mathbf{u}_{i,0} \) is the incremental policy, and \( \boldsymbol{\xi}_i \) is the estimation error.
3. **DR-RL tracking control scheme**: Under the parallel learning architecture, the incremental policy \( \Delta\mathbf{u}_i \) is designed as a combination of an incremental base policy \( \Delta\mathbf{u}_{ib} \) and an incremental residual policy \( \Delta\mathbf{u}_{ir} \): \[ \Delta\mathbf{u}_i =

Data Informed Residual Reinforcement Learning for High-Dimensional Robotic Tracking Control
