Abstract: This paper studies a novel guidance framework, based on deep reinforcement learning (DRL), for a vehicle intercepting a high-speed maneuvering target. The framework accounts for energy consumption, autopilot lag dynamics, and input saturation, and it copes effectively with flight phases that have large flight-path angle errors and with various uncertainties. It learns an end-to-end mapping from the observation states to the guidance command, where the observation states consist of the line-of-sight (LOS) angle, the relative distance, and their rates as measured by the seeker. The observability of the LOS angle and relative distance is also taken into account when constructing the reward function. The relative engagement kinematic model of the interceptor and target is established and combined with the PPO guidance algorithm, and the two are jointly described as a Markov decision process (MDP). The guidance framework is optimized using an improved proximal policy optimization (PPO) algorithm and demonstrated in a simulated terminal phase in near space. Specifically, the PPO guidance algorithm consists of a policy (actor) neural network and a critic neural network, both of which are standard fully connected networks. Observation states and rewards are collected and reused through an experience replay method. An exponentially decaying learning rate, mini-batch stochastic gradient ascent (SGA), z-score standardization, and the Adam optimizer are adopted to train the reinforcement learning algorithm more efficiently. The proposed guidance framework generalizes well and handles both fixed and stochastic engagement scenarios, which means that the interceptor can cope with practical combat scenarios not seen during training. Its robustness is validated by Monte Carlo simulation under various uncertainties. Moreover, the DRL guidance framework can satisfy the requirements of onboard application.
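
Three of the training ingredients named in the abstract can be illustrated with a minimal NumPy sketch: z-score standardization, an exponentially decaying learning rate, and the clipped surrogate objective of standard PPO. The abstract does not specify the paper's "improved" PPO variant, so the function names, the clipping parameter of 0.2, and the decay schedule below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def zscore(x, eps=1e-8):
    """Standardize values to zero mean and unit variance (z-score)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (x.std() + eps)

def exp_decay_lr(lr0, decay_rate, step, decay_steps):
    """Exponentially decaying learning rate (assumed schedule form)."""
    return lr0 * decay_rate ** (step / decay_steps)

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective, to be maximized.

    logp_new / logp_old are log-probabilities of the sampled actions
    under the current and the data-collecting policy, respectively.
    """
    ratio = np.exp(np.asarray(logp_new, dtype=float)
                   - np.asarray(logp_old, dtype=float))
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # Taking the element-wise minimum gives a pessimistic estimate of the gain.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

In a mini-batch SGA loop, the advantages of each batch would typically be passed through `zscore` before evaluating `ppo_clip_objective`, and the optimizer step size would be refreshed from `exp_decay_lr` at each update.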

Proximal policy optimization guidance algorithm for intercepting near-space maneuvering targets
