Abstract: We present an approach to high-precision trajectory tracking control of an unmanned surface vehicle (USV) based on receding horizon reinforcement learning (RHRL). The control architecture combines feedforward and feedback components. The feedforward component is derived directly from the curvature of the reference path and the dynamic model. The feedback component is learned by the RHRL algorithm, which addresses the optimal tracking control problem. The method builds on the receding horizon (rolling time domain) optimization mechanism, converting the infinite-horizon optimal control problem into a sequence of finite-horizon control problems that can be solved. In contrast to Lyapunov model predictive control (LMPC) and sliding mode control (SMC), the RHRL controller yields an explicit state feedback control law, so it can be deployed directly after offline learning or learned online. Within each prediction horizon, a time-independent actor-critic (executive-evaluator) network structure is used to learn the optimal value function and control policy. We prove convergence of the RHRL algorithm within each prediction horizon and analyze the stability of the closed-loop system. Finally, USV trajectory control tests are carried out in a simulated environment.
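The receding-horizon mechanism mentioned in the abstract can be illustrated with a deliberately simplified sketch. It assumes a scalar linear system `x[k+1] = a*x[k] + b*u[k]` with quadratic cost, and it solves each finite-horizon problem with a backward Riccati recursion instead of the learned network structure used in the paper. All names and numeric values (`a`, `b`, `q`, `r`, `p_final`, the horizon length) are hypothetical and chosen only for illustration.

```python
def finite_horizon_gains(a, b, q, r, p_final, horizon):
    """Backward Riccati recursion for the scalar system x[k+1] = a*x[k] + b*u[k].

    Returns gains K[0..horizon-1] such that u = -K[i] * x minimizes
    sum(q*x^2 + r*u^2) over the horizon plus the terminal cost p_final * x_N^2.
    """
    p = p_final
    gains = []
    for _ in range(horizon):
        k_gain = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - a * b * p * k_gain
        gains.append(k_gain)
    gains.reverse()  # gains[0] now belongs to the first step of the horizon
    return gains


def receding_horizon_rollout(x0, a, b, q, r, p_final, horizon, steps):
    """Solve a finite-horizon problem at every step and apply only its first control."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        gains = finite_horizon_gains(a, b, q, r, p_final, horizon)
        u = -gains[0] * x   # explicit state feedback law
        x = a * x + b * u   # advance the (assumed) plant by one step
        trajectory.append(x)
    return trajectory


# Hypothetical, open-loop unstable plant (a > 1) used only for illustration.
trajectory = receding_horizon_rollout(
    x0=1.0, a=1.2, b=1.0, q=1.0, r=1.0, p_final=1.0, horizon=10, steps=30
)
```

Because this toy plant is time-invariant, every horizon produces the same first gain, so the resulting law `u = -K * x` is an explicit state feedback that could be computed once offline. This mirrors, in a much simpler setting, why an explicit control law allows offline deployment.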

What problem does this paper attempt to address?

This paper addresses high-precision trajectory tracking control of unmanned surface vehicles (USVs) in complex marine environments. Specifically, the authors propose a control method based on Receding Horizon Reinforcement Learning (RHRL) that aims to improve the lateral control precision of the USV. The paper notes that existing control methods, such as PID control, fuzzy control, model predictive control (MPC), and sliding mode control (SMC), have limitations in achieving optimal trajectory tracking control, especially when dealing with nonlinear systems and environmental disturbances. To overcome these shortcomings, the paper proposes a control architecture that combines feedforward and feedback control components, with the feedback component obtained by the RHRL algorithm.

### Main problems

1. **High-precision trajectory tracking**: How can high-precision trajectory tracking control of a USV be achieved in complex marine environments?
2. **Optimizing control performance**: How can a control method improve computational efficiency and learning efficiency while preserving control precision?
3. **Anti-interference ability**: How can the stability and robustness of the USV be improved when it is subject to environmental disturbances?

### Solutions

1. **Dynamic deviation model**: A dynamic deviation model of the USV is constructed, with a feedforward control part and a feedback control part. The feedforward control is derived directly from the curvature of the reference path and the deviation model, while the feedback control is obtained by applying the RHRL algorithm.
2. **RHRL algorithm**: An RHRL algorithm based on the receding horizon optimization mechanism is proposed. It transforms the infinite-horizon optimal control problem into a series of finite-horizon heuristic dynamic programming problems. The resulting controller can be learned online or deployed directly after offline learning.
3. **Convergence and stability analysis**: Rigorous theoretical proofs establish the convergence of the RHRL algorithm within each prediction horizon, and the stability of the closed-loop system is analyzed.
4. **Simulation verification**: USV trajectory control tests are carried out in a simulation environment to verify the effectiveness of the proposed method. The results show that, compared with Lyapunov model predictive control (LMPC), the RHRL method has significant advantages in computational efficiency, sample complexity, and learning efficiency.

### Key formulas

- **Rotation matrix**:
  \[ R(\theta)=\begin{bmatrix} \cos\theta&-\sin\theta&0\\ \sin\theta&\cos\theta&0\\ 0&0&1 \end{bmatrix} \]
- **Dynamics equation**:
  \[ M\dot{v}+C(v)v + D(v)v+g(\xi)=\kappa \]
  where \(\kappa = [F_u, F_v, F_r]^T\) is the thrust produced by the thrusters, \(M\) is the mass matrix, \(C(v)\) is the Coriolis and centrifugal matrix, \(D(v)\) is the damping matrix, and \(g(\xi)\) is the restoring force.
- **State equation**:
  \[ \dot{e}=A_c e + B_{c1}u + B_{c2}\omega_d \]
  where \(e = [e_y,\dot{e}_y,e_\phi,\dot{e}_\phi]^T\) is the lateral error state, \(u=\delta_f\) is the control input, and \(\omega_d=\dot{\phi}_d\) is the desired heading angular velocity.
- **Performance index function**:
  \[ V(e(k))=\sum_{l = k}^{k + N-1}L(e(l),u_b(l))+V_f(e(k + N)) \]
  where \(L(e(l),u_b(l))=e^T(l)Qe(l)+Pu_b^2(l)\), \(Q\) is a positive definite matrix, and \(P\) is a preset positive real number.
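As a concrete reading of the performance index function above, the following minimal sketch evaluates \(V(e(k))\) for a given sequence of error states and feedback inputs. The function names, the identity choice for \(Q\), the value of \(P\), the terminal cost, and the sample trajectory are all assumptions made for illustration; they are not taken from the paper.

```python
def stage_cost(e, u_b, Q, P):
    """Stage cost L(e, u_b) = e^T Q e + P * u_b^2 for a list-valued error state e."""
    n = len(e)
    quadratic = sum(e[i] * Q[i][j] * e[j] for i in range(n) for j in range(n))
    return quadratic + P * u_b * u_b


def performance_index(errors, controls, Q, P, terminal_cost):
    """V(e(k)) = sum_{l=k}^{k+N-1} L(e(l), u_b(l)) + V_f(e(k+N)).

    errors   : N+1 error states e(k), ..., e(k+N)
    controls : N feedback inputs u_b(k), ..., u_b(k+N-1)
    """
    if len(errors) != len(controls) + 1:
        raise ValueError("need N+1 error states for N controls")
    running = sum(stage_cost(e, u, Q, P) for e, u in zip(errors[:-1], controls))
    return running + terminal_cost(errors[-1])


# Hypothetical data: 4-dimensional error state [e_y, de_y, e_phi, de_phi], horizon N = 2.
Q = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # assumed Q = I
P = 0.5                                                             # assumed weight
errors = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
controls = [1.0, 2.0]
V = performance_index(errors, controls, Q, P, lambda e: sum(x * x for x in e))
print(V)  # 3.75
```

Here the two stage costs are 1.0 + 0.5 * 1.0 = 1.5 and 0.25 + 0.5 * 4.0 = 2.25, and the assumed terminal cost is zero because the final error state is the origin.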

USV Trajectory Tracking Control Based on Receding Horizon Reinforcement Learning

Learning-Based Trajectory Tracking Control of USV Based on Multi-Source Data

Robust Trajectory Tracking Control of Underactuated Unmanned Surface Vehicles with Exponential Stability: Theory and Experimental Validation

Sliding-Mode Control With Switching-Gain Adaptation for Trajectory Tracking of Underactuated Unmanned Surface Vessels

USV Application Scenario Expansion Based on Motion Control, Path Following and Velocity Planning

Energy-based trajectory tracking control of under-actuated unmanned surface vessels

A Confidence-based Allocation Approach for USV Trajectory Tracking Via Human-Robot Co-Driving

Gender Differences in the Link Between Excessive Drinking and Domain-Specific Cognitive Functioning Among Older Adults

Adaptive Robust Trajectory-Tracking Control for Underactuated USVs with Model Uncertainty and Environmental Disturbance

Reinforcement learning-based finite-time tracking control of an unknown unmanned surface vehicle with input constraints

Collision Avoidance and Path Point Tracking Control for Underactuated Unmanned Surface Vehicles with Unknown Model Nonlinearity

Self‐learning‐based optimal tracking control of an unmanned surface vehicle with pose and velocity constraints

Adaptive Sliding Mode Control Design for Nonlinear Unmanned Surface Vessel Using RBFNN and Disturbance-Observer

Model Predictive Control Based on State Space and Risk Augmentation for Unmanned Surface Vessel Trajectory Tracking

Trajectory Tracking Multi-mode Predictive Control Based on Soft-switching for Unmanned Surface Vehicle

Trajectory Linearization-Based Adaptive PLOS Path Following Control for Unmanned Surface Vehicle with Unknown Dynamics and Rudder Saturation

Deep reinforcement learning with intrinsic curiosity module based trajectory tracking control for USV

Continuous‐time receding‐horizon reinforcement learning and its application to path‐tracking control of autonomous ground vehicles

Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle

A Path Following Control Method for Underactuated Unmanned Surface Vehicles Based on Output Redefinition

Efficient Trajectory Planning and Control for USV with Vessel Dynamics and Differential Flatness