A Deep Reinforcement Learning Approach for Optimized Speed Planning of Connected and Autonomous Vehicles

Xinxin Wang, Fangfei Li, Yang Tang, Xianlun Peng, Jingjie Ni
DOI: https://doi.org/10.1109/tiv.2024.3477940
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Intelligent transportation presents a novel approach to alleviating the impact of automobiles on traffic, energy consumption, and the environment. The rapid advancement and increasing prominence of Connected and Autonomous Vehicle (CAV) technology offer potential advantages in improving traffic flow and optimizing energy usage. In this study, we propose enhancements to deep reinforcement learning techniques to address the optimal speed planning problem for CAVs at signalized intersections. We formulate the CAV speed planning problem as a Markov Decision Process (MDP). In this formulation, we introduce several enhancements to the standard MDP modeling approach that refine the state-action definitions, reward structure, and transition dynamics, tailored to the specific requirements of CAV behavior at signalized intersections. Through state space design, we reduce dimensionality compared with prior studies, leading to more efficient storage utilization. In addition, instead of using speed limit violations as a training termination criterion, we integrate them into the reward structure so that learning continues uninterrupted. We also provide a theoretical analysis of the reward design, confirming that adherence to the defined conditions guarantees acquisition of the optimal policy within the MDP framework. Our simulation results indicate that using the Deep Deterministic Policy Gradient (DDPG) algorithm significantly enhances training efficiency and reduces learning time. We also conduct a comparative analysis of various reinforcement learning algorithms applied to our proposed model. Furthermore, our model outperforms the Intelligent Driver Model (IDM) in terms of travel time and energy consumption, demonstrating its practical advantages and effectiveness.
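The abstract's idea of folding speed limit violations into the reward, rather than ending the episode, can be illustrated with a minimal Python sketch. This is not the paper's model: the state variables, the constants `V_MAX`, `DT`, and `PENALTY`, the quadratic energy proxy, and the `step` function are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class State:
    distance: float  # metres left to the stop line (hypothetical state variable)
    speed: float     # current speed in m/s


V_MAX = 15.0    # assumed speed limit in m/s, not taken from the paper
DT = 1.0        # assumed time step in seconds
PENALTY = 10.0  # assumed weight on speed-limit violations


def step(state: State, accel: float):
    """Advance one time step and return (next_state, reward, done)."""
    speed = max(0.0, state.speed + accel * DT)
    distance = max(0.0, state.distance - speed * DT)

    # Per-step time cost plus a simple quadratic proxy for energy use.
    reward = -1.0 - 0.1 * accel ** 2

    # A speed-limit violation lowers the reward but does NOT end the episode,
    # so the agent keeps collecting experience after a violation.
    if speed > V_MAX:
        reward -= PENALTY * (speed - V_MAX)

    # The episode ends only when the vehicle reaches the intersection.
    done = distance == 0.0
    return State(distance, speed), reward, done
```

Under these assumptions, calling `step(State(100.0, 14.0), 2.0)` pushes the speed above `V_MAX`, so the returned reward includes the penalty term while `done` stays `False`. A termination-based design would instead cut the episode short at that point.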