Abstract: One unresolved issue is how to scale model-based inverse reinforcement learning (IRL) to real robotic manipulation tasks with unpredictable dynamics. The main obstacles are learning from both visual and proprioceptive examples, creating algorithms that scale to high-dimensional state spaces, and learning strong dynamics models. In this work, we provide a gradient-based inverse reinforcement learning framework that learns cost functions purely from visual human demonstrations. The learned cost functions are then used with TD visual model predictive control (MPC) to optimize the trajectory and reproduce the demonstrated behavior. We test our system on hardware using basic object manipulation tasks.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to extend model-based inverse reinforcement learning (IRL) to real robotic manipulation tasks, especially when the dynamics are unpredictable. The main challenges include:

1. **Learning from visual and proprioceptive examples**: How to extract useful information from human demonstrations and transform it into a form that robots can use, especially for tasks in high-dimensional state spaces.
2. **Powerful dynamics models**: How to construct a dynamics model that can accurately predict the results of robot actions, especially in low-dimensional feature-representation spaces.
3. **Complexity of the optimization problem**: Model-based inverse reinforcement learning contains two nested optimization problems. The inner problem optimizes the policy given the cost function and the transition model. The outer problem updates the cost function parameters so that the resulting policy matches the observed demonstrations. This step is very difficult because it requires measuring how changes in the cost function parameters influence the final policy parameters.

To solve these problems, the authors propose a gradient-based inverse reinforcement learning framework that learns the cost function from visual human demonstrations and uses TD visual model predictive control (TD-MPC) to optimize the trajectory. Specifically, the method includes the following key steps:

- **Key-point detector**: Train a key-point detector to extract low-dimensional visual features (key-points) from RGB image inputs. These key-points can represent important positions and areas in the image.
- **Dynamics model**: Use a pre-trained dynamics model to predict the next key-points and joint state of the robot after it performs a specific action.
- **Gradient optimization**: Differentiate through the inner optimization process to obtain gradients with respect to the cost function parameters, which makes the outer optimization more stable and efficient.

In the experimental part, the authors tested this method on the Franka Panda robotic arm and verified its effectiveness and robustness in basic object manipulation tasks.
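
The nested optimization described above can be illustrated with a minimal toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a single 1-D key-point, additive dynamics standing in for the pre-trained dynamics model, a cost with one parameter `theta` (a goal position), and a synthetic demonstration in place of human video. The outer gradient is computed with central finite differences, a simple stand-in for differentiating through the inner loop with automatic differentiation.

```python
import numpy as np

def rollout(z0, actions):
    """Toy dynamics model (assumed): each action shifts a 1-D key-point."""
    return z0 + np.cumsum(actions)

def cost(traj, theta):
    """Parametrized cost (assumed form): squared distance to a goal theta."""
    return np.sum((traj - theta) ** 2)

def plan(z0, theta, horizon=5, steps=50, step_size=0.05):
    """Inner loop: gradient descent on the actions under the current cost."""
    actions = np.zeros(horizon)
    for _ in range(steps):
        # Derivative of cost() with respect to each predicted key-point.
        residual = 2.0 * (rollout(z0, actions) - theta)
        # Action k moves every later key-point, so its gradient is a
        # reverse cumulative sum of the residuals.
        grad = np.cumsum(residual[::-1])[::-1]
        actions = actions - step_size * grad
    return rollout(z0, actions)

def irl_loss(theta, z0, demo):
    """Outer objective: mismatch between planned and demonstrated key-points."""
    return np.sum((plan(z0, theta) - demo) ** 2)

def learn_cost(z0, demo, theta=0.0, iters=100, lr=0.05, eps=1e-4):
    """Outer loop: update theta using a finite-difference gradient."""
    for _ in range(iters):
        grad = (irl_loss(theta + eps, z0, demo)
                - irl_loss(theta - eps, z0, demo)) / (2 * eps)
        theta -= lr * grad
    return theta

z0 = 0.0
theta_true = 1.0                 # hidden goal used only to build the demo
demo = plan(z0, theta_true)      # synthetic demonstration (assumption)
theta_learned = learn_cost(z0, demo)
print(theta_learned)
```

In this sketch the inner loop plays the role of the model predictive controller and the outer loop plays the role of cost learning. With a neural cost and a learned visual dynamics model, the finite-difference step would be replaced by backpropagation through the inner gradient steps.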

Robotic Arm Manipulation with Inverse Reinforcement Learning & TD-MPC
