Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, Ahead of Print.

Abstract: We present DDPGwP (DDPG with Pretraining), a deep reinforcement learning model for autonomous driving decision-making. The model incorporates imitation learning: expert experience is used for supervised learning during initial training, and the resulting weights are preserved. A novel loss function lets the expert experience guide the Actor network's update jointly with the Critic network, while also participating in the Critic network's own updates. As a result, imitation learning dominates the early stages of training and reinforcement learning takes the lead in later stages. Using separated experience replay buffers, we categorize and store the collected experiences as superior, ordinary, or expert. We select sensor inputs from the TORCS (The Open Racing Car Simulator) simulation platform and validate the model experimentally against the original DDPG, A2C, and PPO algorithms. The experiments show that incorporating imitation learning significantly accelerates early-stage training, reduces blind trial-and-error during initial exploration, and improves the algorithm's stability and safety. Separating the experience replay buffers improves sampling efficiency and mitigates overfitting. Beyond training faster, the simulated vehicle learns better strategies and obtains higher reward values. Together, these results indicate that the proposed algorithm converges faster and offers better stability, safety, and policy-making capability than the compared baselines.
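The two mechanisms described in the abstract, separated replay buffers and a loss in which imitation dominates early and reinforcement learning dominates later, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how "superior" and "ordinary" experiences are distinguished, how batches are mixed, or how the loss terms are weighted, so the reward threshold, the sampling fractions, the linear decay schedule, and every name below (`SeparatedReplayBuffer`, `imitation_weight`, `actor_loss`) are assumptions made for illustration.

```python
import random
from collections import deque


class SeparatedReplayBuffer:
    """Three separate pools: expert, superior, and ordinary transitions.

    Assumption: a transition counts as "superior" when its reward reaches
    `reward_threshold`. The abstract does not state the actual criterion.
    """

    def __init__(self, capacity=10000, reward_threshold=1.0, seed=None):
        self.expert = []                        # demonstrations, never evicted
        self.superior = deque(maxlen=capacity)  # high-reward agent experience
        self.ordinary = deque(maxlen=capacity)  # all other agent experience
        self.reward_threshold = reward_threshold
        self.rng = random.Random(seed)

    def add_expert(self, transition):
        self.expert.append(transition)

    def add(self, transition):
        # transition = (state, action, reward, next_state, done)
        reward = transition[2]
        if reward >= self.reward_threshold:
            self.superior.append(transition)
        else:
            self.ordinary.append(transition)

    def sample(self, batch_size, fractions=(0.25, 0.25, 0.5)):
        """Draw a mixed batch; `fractions` (expert, superior, ordinary) is hypothetical."""
        batch = []
        for pool, frac in zip((self.expert, self.superior, self.ordinary), fractions):
            k = min(len(pool), int(batch_size * frac))
            batch.extend(self.rng.sample(list(pool), k))
        return batch


def imitation_weight(step, decay_steps=10000):
    """Assumed schedule: weight decays linearly from 1 to 0 over `decay_steps`."""
    return max(0.0, 1.0 - step / decay_steps)


def actor_loss(q_value, policy_action, expert_action, step, decay_steps=10000):
    """Blend a DDPG-style term (-Q) with a behaviour-cloning term (MSE to expert).

    Early on the cloning term dominates; later the Critic's Q-value does.
    The blending form is an assumption, not the loss from the paper.
    """
    lam = imitation_weight(step, decay_steps)
    cloning = sum((p - e) ** 2 for p, e in zip(policy_action, expert_action))
    cloning /= len(policy_action)
    return -(1.0 - lam) * q_value + lam * cloning
```

With this schedule the Actor is trained almost purely by imitation at step 0 and purely by the Critic's value estimate once `decay_steps` is reached. In a real training loop the scalar arithmetic would be replaced by tensor operations in a deep learning framework.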

Solving driving policy for autonomous vehicles via AMDP-Q

A Shared Control Approach for Autonomous Vehicles via Driver Behaviors Learning

Learning an Efficient and Safe Policy for Highway Driving Using Supervised Learning and Reinforcement Learning

An Automatic Driving Control Method Based on Deep Deterministic Policy Gradient

Learning Hierarchical Behavior and Motion Planning for Autonomous Driving

MPDM: Multipolicy decision-making in dynamic, uncertain environments for autonomous driving

End-to-End Autonomous Driving Decision-Making Solution Based on Pri-TD3

Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving

Policy Iteration Based Approximate Dynamic Programming Toward Autonomous Driving in Constrained Dynamic Environment

Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation

Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning

High-Speed Ramp Merging Behavior Decision for Autonomous Vehicles Based on Multi-Agent Reinforcement Learning

DQ-GAT: Towards Safe and Efficient Autonomous Driving With Deep Q-Learning and Graph Attention Networks

Hybrid LLM-DDQN based Joint Optimization of V2I Communication and Autonomous Driving

A decision-making of autonomous driving method based on DDPG with pretraining

Enhanced Safety in Autonomous Driving: Integrating a Latent State Diffusion Model for End-to-End Navigation

Situation-aware decision making for autonomous driving on urban road using online POMDP

Deep reinforcement learning for autonomous driving in uncontrolled intersections of Indian roads

Mixed‐Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management

Learning Residual Model of Model Predictive Control via Random Forests for Autonomous Driving

Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning