TD3LVSL: A Lane-Level Variable Speed Limit Approach Based on Twin Delayed Deep Deterministic Policy Gradient in a Connected Automated Vehicle Environment
Wenqi Lu, Ziwei Yi, Yuanli Gu, Yikang Rui, Bin Ran
DOI: https://doi.org/10.1016/j.trc.2023.104221
2023-01-01
Abstract: Variable speed limit (VSL) control plays a vital role in the emerging connected automated vehicle highway (CAVH) system, where it can alleviate recurrent traffic congestion caused by capacity drop and accidents. However, the effect of VSL on traffic efficiency is still controversial. It is therefore necessary to study how to realize the potential benefits of VSL by balancing its influence on reducing crash risk and enhancing traffic efficiency. To fill this technological gap, this paper proposes a reinforcement learning-based lane-level VSL (LVSL) control approach to conduct refined traffic control on the mainlines. Firstly, an actor-critic framework is developed to generate and evaluate the discrete speed limits of each lane in a continuous action space. To optimize traffic control performance, a hybrid reward function is developed by simultaneously considering traffic safety and traffic efficiency in the bottleneck area. Then, to solve the overestimation bias problem of actor-critic methods caused by function approximation error, a twin delayed deep deterministic policy gradient (TD3) method is introduced to train the framework of the LVSL method. Finally, a real-world recurrent bottleneck on the State Route 91 highway in California is simulated with connected automated vehicles to examine the performance of the TD3-based LVSL (TD3LVSL) controller. The simulation results reveal that the proposed method is capable of reducing crash risk and improving traffic efficiency simultaneously. Compared with the LVSL controller based on the deep deterministic policy gradient approach, the TD3LVSL controller shows better performance in terms of traffic safety and efficiency. These findings indicate that the proposed controller could contribute to future traffic control in a CAVH environment.
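The abstract names two mechanisms that a short sketch may help make concrete: how TD3 counters overestimation bias, and how a continuous actor output can be turned into a discrete per-lane speed limit. The code below is a minimal illustration of the generic TD3 target computation (clipped double-Q learning plus target policy smoothing), not the authors' implementation. All function names, the hyperparameter values (`gamma`, `sigma`, `noise_clip`), the action bounds, and the set of allowed speed limits are assumptions made for illustration; the abstract specifies none of them. TD3's delayed actor updates are not shown.

```python
import random


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target.

    Taking the smaller of the two target-critic estimates is how TD3
    limits the overestimation bias of a single critic.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def smoothed_target_action(action, low, high, sigma=0.2, noise_clip=0.5, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target
    action, then clip the result back into the action bounds."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, sigma)))
    return max(low, min(high, action + noise))


def snap_to_speed_limit(action, low=-1.0, high=1.0, limits=(50, 55, 60, 65, 70)):
    """Map a continuous action in [low, high] to the nearest allowed limit.

    The limit values are hypothetical; the abstract does not state
    which speed limits the controller may post.
    """
    fraction = (action - low) / (high - low)
    speed = limits[0] + fraction * (limits[-1] - limits[0])
    return min(limits, key=lambda limit: abs(limit - speed))


# One continuous action per lane, snapped to a posted limit for that lane.
lane_actions = [-1.0, 0.0, 1.0]
lane_limits = [snap_to_speed_limit(a) for a in lane_actions]
```

In this sketch, a three-lane action vector of `[-1.0, 0.0, 1.0]` maps to the lowest, middle, and highest allowed limits respectively, which is one simple way to obtain discrete lane-level limits from a continuous action space.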
What problem does this paper attempt to address?