MTLight: Efficient Multi-Task Reinforcement Learning for Traffic Signal Control

Liwen Zhu, Peixi Peng, Zongqing Lu, Yonghong Tian
2024-04-01
Abstract: Traffic signal control has a great impact on alleviating traffic congestion in modern cities. Deep reinforcement learning (RL) has been widely used for this task in recent years, demonstrating promising performance but also facing challenges such as limited performance and sample inefficiency. To handle these challenges, MTLight is proposed to enhance the agent observation with a latent state, which is learned from numerous traffic indicators. Meanwhile, multiple auxiliary and supervisory tasks are constructed to learn the latent state, and two types of embedded latent features, the task-specific feature and the task-shared feature, are used to enrich the latent state. Extensive experiments conducted on CityFlow demonstrate that MTLight has leading convergence speed and asymptotic performance. We further simulate a peak-hour pattern in all scenarios with increasing control difficulty, and the results indicate that MTLight is highly adaptable.
Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the issue of multi-agent reinforcement learning in urban traffic signal control. Specifically:

1. **Performance Improvement**: Enhance the performance of traffic signal control tasks, especially in large-scale road networks, by introducing a method called MTLight.
2. **Sample Efficiency**: Propose a multi-task reinforcement learning framework that utilizes latent states to improve sample efficiency and enhance policy performance.
3. **Environmental Adaptability**: Achieve better adaptability and stability under different traffic flow configurations by learning latent states that contain global information.

The paper validates the effectiveness of the MTLight method through a series of experiments, demonstrating its excellent performance under both real and synthetic traffic flow configurations in multiple cities, with faster convergence speeds. Additionally, by analyzing the performance of different methods under various traffic patterns, the paper further illustrates the rationality and robustness of MTLight in policy selection.
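
The core idea in the abstract, enhancing each agent's observation with a latent state built from task-shared and task-specific features, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear encoders, the dimensions, the random weights, the mean-squared-error auxiliary loss, and every name below (`LatentEncoder`, `augment_observation`, `auxiliary_loss`) are assumptions made purely for illustration.

```python
import random


def make_weights(rows, cols, rng):
    """Random weight matrix as a list of rows (stand-in for trained parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class LatentEncoder:
    """Hypothetical linear encoder mapping traffic indicators to latent features."""

    def __init__(self, in_dim, latent_dim, seed):
        self.weights = make_weights(latent_dim, in_dim, random.Random(seed))

    def encode(self, indicators):
        return linear(self.weights, indicators)


def augment_observation(obs, indicators, shared_enc, specific_enc):
    """Append task-shared and task-specific latent features to the raw observation."""
    return list(obs) + shared_enc.encode(indicators) + specific_enc.encode(indicators)


def auxiliary_loss(predicted, target):
    """Mean squared error for one auxiliary supervised task (an assumed loss choice)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Illustrative values only: 4 observation entries, 6 traffic indicators, latent size 3.
obs = [2.0, 0.0, 5.0, 1.0]
indicators = [0.3, 0.7, 0.1, 0.9, 0.4, 0.2]
shared_enc = LatentEncoder(in_dim=6, latent_dim=3, seed=0)
specific_enc = LatentEncoder(in_dim=6, latent_dim=3, seed=1)

augmented = augment_observation(obs, indicators, shared_enc, specific_enc)
```

In this sketch the policy would consume `augmented` instead of `obs`, while the encoders would be trained through auxiliary losses such as `auxiliary_loss`. How MTLight actually defines its auxiliary tasks and combines their losses with the RL objective is not stated in the summary above, so those details are left out.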