UniTSA: A Universal Reinforcement Learning Framework for V2X Traffic Signal Control

Maonan Wang, Xi Xiong, Yuheng Kan, Chengcheng Xu, Man-On Pun
DOI: https://doi.org/10.1109/TVT.2024.3403879
2023-12-08
Abstract: Traffic congestion is a persistent problem in urban areas, which calls for the development of effective traffic signal control (TSC) systems. While existing Reinforcement Learning (RL)-based methods have shown promising performance in optimizing TSC, it is challenging to generalize these methods across intersections of different structures. In this work, a universal RL-based TSC framework is proposed for Vehicle-to-Everything (V2X) environments. The proposed framework introduces a novel agent design that incorporates a junction matrix to characterize intersection states, making the proposed model applicable to diverse intersections. To equip the proposed RL-based framework with enhanced capability of handling various intersection structures, novel traffic state augmentation methods are tailor-made for signal light control systems. Finally, extensive experimental results derived from multiple intersection configurations confirm the effectiveness of the proposed framework. The source code in this work is available at https://github.com/wmn7/Universal_Light
Systems and Control, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses urban traffic congestion, specifically how to optimize traffic flow through an effective traffic signal control (TSC) system. Although existing reinforcement learning (RL)-based methods perform well at optimizing TSC, they are difficult to generalize across intersections with different structures. The paper therefore proposes UniTSA, a general-purpose reinforcement learning framework for traffic signal control in Vehicle-to-Everything (V2X) environments. Through a new agent design and traffic state augmentation methods, the framework lets one model apply to multiple intersections with different structures and perform well on unseen intersection configurations.

### Main contributions

1. **General-purpose TSC framework**: The paper proposes an adaptive TSC framework named UniTSA, which combines a general-purpose reinforcement learning model with a novel agent design and can handle multiple intersections with different structures. A fine-tuning mechanism is also provided to further improve performance at key intersections.
2. **Traffic state augmentation methods**: The paper develops five traffic state augmentation methods, which improve the agent's understanding of different intersections and raise performance on both the training set and the test set.
3. **Experimental verification**: Extensive experiments on 12 intersections with different structures show that the proposed UniTSA model significantly outperforms traditional general-purpose models on unseen intersections. For new intersections, fine-tuning the pre-trained model achieves comparable or better performance with significantly reduced training time.

### Method overview

1. **Framework design**:
   - **General-purpose agent design**: A junction matrix represents the intersection state, which lets the model handle intersections with different structures.
   - **Action design**: A "hold or change the current phase" action keeps the model structure consistent across intersections with different configurations.
   - **Reward function**: The reward is the negative of the average queue length over the movements, which promotes more efficient traffic flow.
2. **Traffic state augmentation**:
   - **Movement shuffling**: Randomly swapping the rows of the junction matrix simulates different rotations and flips, so the agent can adapt to different views of the same intersection.
   - **Lane number change**: Randomly modifying the number of lanes in each movement vector lets the agent handle various lane configurations.
   - **Traffic flow scaling**: Multiplying the flow and occupancy rate by a uniformly distributed random number makes the agent focus on the relative vehicle distribution rather than the absolute number.
   - **Gaussian noise addition**: Adding Gaussian noise to the junction matrix lets the agent adapt to noisy and uncertain traffic conditions.
   - **Masking**: Randomly setting some values in the junction matrix to zero prompts the agent to infer traffic dynamics from the information before and after the masked entries.
3. **Intersection feature extraction**:
   - **CNN structure**: Two 2D convolutional layers extract the time-series information of the intersection and generate a hidden representation.
   - **RNN structure**: A 1D convolutional layer with shared parameters extracts the information of each junction matrix, and an RNN module then captures the temporal dependence.
   - **Transformer structure**: A weight-sharing CNN extracts the features of each time step, and a Transformer encoder then captures the temporal dependence of the feature sequence.
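The "hold or change" action and the queue-based reward described under framework design can be illustrated with a minimal Python sketch. It is not taken from the UniTSA source code: the function names are hypothetical, and the sketch assumes that "change" advances to the next phase in a fixed cyclic order, which the summary does not state.

```python
from typing import Sequence


def next_phase_index(current: int, action: int, num_phases: int) -> int:
    """Apply a binary action: 0 holds the current phase, 1 changes phase.

    Assumption: phases are visited in a fixed cyclic order, so "change"
    means "move to the next phase and wrap around at the end".
    """
    if action not in (0, 1):
        raise ValueError("action must be 0 (hold) or 1 (change)")
    return (current + action) % num_phases


def queue_reward(queue_lengths: Sequence[float]) -> float:
    """Return the negative mean queue length over all movements."""
    if not queue_lengths:
        return 0.0  # no movements observed, so there is no queue to penalize
    return -sum(queue_lengths) / len(queue_lengths)


# Example: hold once, then change twice, on an intersection with four phases.
phase = 0
for action in (0, 1, 1):
    phase = next_phase_index(phase, action, num_phases=4)

# Example reward for three movements with queues of 2, 4 and 0 vehicles.
reward = queue_reward([2.0, 4.0, 0.0])
```

Because the action space always has exactly two entries, the policy's output layer has the same size whatever the number of phases at the intersection, which is the consistency property the action design aims for.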
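The five traffic state augmentations listed above can be sketched on a single junction matrix, stored here as a plain list of rows with one row per movement. This is an illustrative sketch rather than the authors' implementation: the column layout (`FLOW`, `OCCUPANCY`, `LANES`), the function names, and all default ranges and probabilities are assumptions, since the summary does not specify them.

```python
import random
from typing import List

Matrix = List[List[float]]  # one row per movement, one column per feature

# Hypothetical column layout; the real feature order is not given in the summary.
FLOW, OCCUPANCY, LANES = 0, 1, 2


def shuffle_movements(m: Matrix, rng: random.Random) -> Matrix:
    """Movement shuffling: permute the rows of the junction matrix."""
    rows = [row[:] for row in m]
    rng.shuffle(rows)
    return rows


def change_lane_numbers(m: Matrix, rng: random.Random, max_lanes: int = 4) -> Matrix:
    """Lane number change: redraw the lane count of every existing movement."""
    out = [row[:] for row in m]
    for row in out:
        if row[LANES] > 0:  # leave empty (all-zero) movement rows untouched
            row[LANES] = float(rng.randint(1, max_lanes))
    return out


def scale_flow(m: Matrix, rng: random.Random, low: float = 0.8, high: float = 1.2) -> Matrix:
    """Traffic flow scaling: multiply flow and occupancy by one uniform factor."""
    factor = rng.uniform(low, high)
    out = [row[:] for row in m]
    for row in out:
        row[FLOW] *= factor
        row[OCCUPANCY] *= factor
    return out


def add_gaussian_noise(m: Matrix, rng: random.Random, sigma: float = 0.01) -> Matrix:
    """Gaussian noise addition: perturb every entry with zero-mean noise."""
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in m]


def mask_entries(m: Matrix, rng: random.Random, p: float = 0.1) -> Matrix:
    """Masking: set each entry to zero independently with probability p."""
    return [[0.0 if rng.random() < p else x for x in row] for row in m]


# Example: chain several augmentations on a toy matrix with three movements.
rng = random.Random(0)
junction = [[0.4, 0.2, 2.0], [0.2, 0.1, 1.0], [0.0, 0.0, 0.0]]
augmented = mask_entries(
    add_gaussian_noise(scale_flow(shuffle_movements(junction, rng), rng), rng), rng
)
```

Each function returns a new matrix and leaves its input unchanged, so the augmentations can be composed in any order or applied with some probability per training sample. Using one scaling factor for the whole matrix preserves the ratios between movements, which matches the stated goal of focusing on the relative vehicle distribution.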
### Conclusion

The UniTSA framework addresses the difficulty existing RL methods have in generalizing across intersections with different structures by introducing a general-purpose agent design and traffic state augmentation methods. The experimental results show that UniTSA performs well on unseen intersections and can quickly reach good performance on new intersections through fine-tuning.