Abstract: In this study, the Twin-Delayed DDPG (TD3) algorithm is used to solve a challenging virtual Artificial Intelligence task: training a four-legged ant robot as an intelligent agent to run across a field. TD3 is a Deep Reinforcement Learning model that combines several state-of-the-art methods in Artificial Intelligence, including Policy Gradient, Actor-Critic, and continuous Double Deep Q-Learning. These Deep Reinforcement Learning approaches train an intelligent agent to interact with an environment using automatic feature engineering, that is, with minimal domain knowledge. For the implementation of TD3, we used a feedforward neural network with two hidden layers of 400 and 300 nodes respectively, with Rectified Linear Units (ReLU) as the activation function between layers, for both the Actor and the Critics. We then added a final tanh unit after the output of the Actor. The Critic receives both the state and the action as input to the first layer. The parameters of both networks were updated using the Adam optimizer. The idea behind TD3 is to reduce overestimation bias in the Actor-Critic setting, where the remedies developed for Deep Q-Learning with discrete actions are ineffective. Based on the maximum average reward over the evaluation time-steps, our model achieved a maximum of approximately 2364. We therefore conclude that TD3 improved on both the learning speed and the performance of Deep Deterministic Policy Gradient (DDPG) in a challenging continuous control environment.
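
The architecture described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the state and action sizes (28 and 8), the weight initialisation, and the hyperparameters `gamma`, `noise_std`, and `noise_clip` are assumptions chosen for illustration, and all function names are hypothetical. The sketch shows the 400/300 ReLU layers, the tanh on the Actor output, a Critic that takes the state and action together, and a target value that takes the minimum of two Critics, which is how TD3 is commonly described as limiting overestimation. Training with Adam and the delayed Actor updates are omitted.

```python
import math
import random

def linear(in_dim, out_dim, rng):
    """Random weights and zero biases for one fully connected layer."""
    bound = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-bound, bound) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def dense(layer, x):
    """Affine map W x + b for a single input vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def make_mlp(in_dim, out_dim, rng, hidden=(400, 300)):
    """Two hidden layers of 400 and 300 nodes, as stated in the abstract."""
    sizes = [in_dim, *hidden, out_dim]
    return [linear(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(mlp, x):
    """ReLU between layers, no activation on the last layer."""
    for layer in mlp[:-1]:
        x = [max(0.0, v) for v in dense(layer, x)]
    return dense(mlp[-1], x)

def actor(mlp, state, max_action=1.0):
    """Actor: final tanh keeps every action component in [-max_action, max_action]."""
    return [max_action * math.tanh(v) for v in forward(mlp, state)]

def critic(mlp, state, action):
    """Critic: state and action are concatenated as input to the first layer."""
    return forward(mlp, state + action)[0]

def td3_target(reward, done, next_state, actor_t, critic1_t, critic2_t, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Target value using the smaller of two Critic estimates.

    gamma, noise_std and noise_clip are assumed values, not taken from the abstract.
    """
    smoothed = []
    for a in actor(actor_t, next_state, max_action):
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        smoothed.append(max(-max_action, min(max_action, a + noise)))
    q1 = critic(critic1_t, next_state, smoothed)
    q2 = critic(critic2_t, next_state, smoothed)
    return reward + gamma * (1.0 - done) * min(q1, q2)

rng = random.Random(0)
STATE_DIM, ACTION_DIM = 28, 8  # hypothetical sizes, not given in the abstract
actor_net = make_mlp(STATE_DIM, ACTION_DIM, rng)
critic_1 = make_mlp(STATE_DIM + ACTION_DIM, 1, rng)
critic_2 = make_mlp(STATE_DIM + ACTION_DIM, 1, rng)

state = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]
action = actor(actor_net, state)
target = td3_target(1.0, 0.0, state, actor_net, critic_1, critic_2, rng)
terminal_target = td3_target(1.0, 1.0, state, actor_net, critic_1, critic_2, rng)
```

In this sketch the same networks stand in for their target copies to keep the example short; a full implementation would normally keep separate, slowly updated target networks.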

Deep Reinforcement Learning with Robust Deep Deterministic Policy Gradient

Swap Softmax Twin Delayed Deep Deterministic Policy Gradient

Twin-Delayed Ddpg: A Deep Reinforcement Learning Technique To Model A Continuous Movement Of An Intelligent Robot Agent

Edge Delayed Deep Deterministic Policy Gradient: efficient continuous control for edge scenarios

Twin Delayed Multi-Agent Deep Deterministic Policy Gradient

Robust Motion Control for UAV in Dynamic Uncertain Environments Using Deep Reinforcement Learning.

Softmax Deep Double Deterministic Policy Gradients

Dueling Network Architecture for Multi-Agent Deep Deterministic Policy Gradient

Path Following for Autonomous Ground Vehicle Using DDPG Algorithm: A Reinforcement Learning Approach

Deep deterministic policy gradient algorithm: A systematic review

D3PG: Decomposed Deep Deterministic Policy Gradient for Continuous Control

Self-Driving Via Improved DDPG Algorithm

Deterministic Value-Policy Gradients

Regularly Updated Deterministic Policy Gradient Algorithm

An Automatic Driving Control Method Based on Deep Deterministic Policy Gradient

ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control

Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients

A Dynamically Adaptive Approach to Reducing Strategic Interference for Multi-agent Systems

Asynchronous Episodic Deep Deterministic Policy Gradient: Toward Continuous Control in Computationally Complex Environments

Model-Based Ddpg for Motor Control

Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient