Abstract: To improve machine intelligence, machines need to learn human behavior. In this paper, we make the reasonable hypothesis that a human performing a task behaves like a linear quadratic regulator whose cost function is unknown to the machine. In addition, the system dynamics in many real applications are completely unknown. Our purpose is therefore to find a cost function equivalent to the human's, using only control input and system state data, for continuous-time linear human-in-the-loop (HiTL) systems with completely unknown dynamics. An adaptive inverse optimal control (IOC) method is proposed for this purpose; it helps the machine better understand human behavior and makes it possible to reproduce a similar optimal controller in other environments. Because the weighting matrix is difficult to obtain directly, an adaptive integral concurrent learning (ICL) algorithm is developed to identify the system matrices and the human feedback gain matrix online, which removes the need for the persistent excitation (PE) condition. The weighting matrix is then determined by solving a convex programming problem. Finally, simulation results on the lane-keeping assist system of an intelligent vehicle are presented to demonstrate the validity of the proposed adaptive IOC algorithm.

Note to Practitioners: In practice, it is hoped that a machine can work like a human so that it can replace the human in certain tasks. However, designing the corresponding algorithms for the machine is not easy, because many tests must be carried out to select appropriate parameters. An effective alternative is to teach the machine to learn the human's demonstrated behavior. Notably, the environment (system dynamics) may not be known a priori, and only the system state and control input are measurable. To this end, an adaptive IOC method is developed for imitation learning of human behavior; it is implemented online but requires only limited data. The proposed approach can be used in autonomous vehicles, service robots, and medical rehabilitation. In future research, we will extend the proposed method to more complex environments.
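
As a rough illustration of the inverse problem described above, the following sketch works through a scalar case in which the dynamics and the feedback gain are already known. It is not the paper's algorithm: the online ICL identification and the convex program are not reproduced, and the function names (`forward_gain`, `inverse_weight`) and all numeric values are hypothetical choices made for this example. It shows that a gain produced by an LQR with weights (q, r) can be explained by an equivalent cost with a different scaling.

```python
import math


def forward_gain(a, b, q, r):
    """Scalar LQR for dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2.

    Returns the gain k of the optimal feedback u = -k*x.
    """
    # Stabilising (positive) root of the scalar Riccati equation
    #   2*a*p - (b*p)**2 / r + q = 0
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def inverse_weight(a, b, k, r=1.0):
    """Recover a state weight q so that (q, r) reproduces the observed gain k.

    Assumption for this sketch: a, b and k are known exactly, and r is
    fixed by the user, since the cost is only identifiable up to scaling.
    """
    p = r * k / b                        # from k = b*p/r
    return (b * p) ** 2 / r - 2 * a * p  # Riccati equation solved for q


if __name__ == "__main__":
    a, b = -1.0, 2.0                     # hypothetical plant
    k = forward_gain(a, b, q=3.0, r=0.5)  # "human" gain, true weights hidden
    q_hat = inverse_weight(a, b, k, r=1.0)
    print(k)      # 2.0
    print(q_hat)  # 6.0, i.e. the true ratio q/r = 3.0/0.5 with r fixed to 1
    print(forward_gain(a, b, q_hat, 1.0))  # 2.0, same behaviour reproduced
```

The recovered pair (6, 1) differs from the hidden pair (3, 0.5) but yields the same feedback gain, which is the sense in which the abstract speaks of an equivalent cost function rather than the human's exact one.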

Adaptive Inverse Optimal Control for Linear Human-in-the-Loop Systems with Completely Unknown Dynamics
