Training a Reinforcement Learning Agent with AutoRL for Traffic Signal Control

Yanjiao Xu, Yuxin Wang, Chanjuan Liu
DOI: https://doi.org/10.1109/fcsit57414.2022.00022
2022-01-01
Abstract: Traffic signal control systems affect the efficiency of transportation. Adaptive traffic signal control has attracted attention because it automatically adjusts signal phases to changing traffic conditions, and researchers have been applying deep reinforcement learning to its design. Meta-parameters such as $\gamma$, the discount rate of future rewards, are crucial in reinforcement learning, and researchers, especially those from other disciplines, must conduct extensive experiments to find appropriate values for them. Automated Reinforcement Learning (AutoRL) automates the choice of meta-parameters, replacing manual tuning; gradient-based meta-learning is one AutoRL approach. To save time spent exploring meta-parameters, we integrate a gradient-based meta-learning algorithm into DQN (GBML-DQN). We conduct experiments on the traffic simulator SUMO. Our results show that GBML-DQN promotes convergence of the Q-value function and avoids overestimation to some extent, especially under inappropriate training settings in which DQN fails to train.
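To make the role of $\gamma$ concrete, the following is a minimal sketch of the standard one-step DQN target, r + γ · max Q(s', a'), evaluated for several discount rates. It is not the paper's GBML-DQN update, which the abstract does not specify; the function name `dqn_target` and the sample reward and Q-values are assumptions chosen only for illustration.

```python
def dqn_target(reward, next_q_values, gamma, done=False):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a').

    Hypothetical helper for illustration; not taken from the paper.
    """
    if done:
        # No future reward is bootstrapped from a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    # Illustrative values only: a negative reward (e.g. a waiting-time penalty)
    # and made-up Q-value estimates for the next state.
    reward = -1.0
    next_q = [1.0, 2.0, 4.0]
    for gamma in (0.0, 0.5, 0.9):
        print(f"gamma={gamma}: target={dqn_target(reward, next_q, gamma):.2f}")
```

The same transition yields very different regression targets as $\gamma$ changes, which is why a poorly chosen value can slow or destabilize Q-value learning and why tuning it automatically is attractive.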