Abstract: This article proposes a novel approach to traffic signal control that combines phase re-service with reinforcement learning (RL). The RL agent directly determines the duration of the next phase in a pre-defined sequence. Before the RL agent's decision is executed, we use shock wave theory to estimate queue expansion at the designated movement allowed for re-service and decide whether phase re-service is necessary. If it is, a temporary re-service phase is inserted before the next regular phase. We formulate the RL problem as a semi-Markov decision process (SMDP) and solve it with proximal policy optimization (PPO). A series of experiments shows significant improvements from the introduction of phase re-service. Average vehicle delay is reduced by up to 29.95%, and its standard deviation by up to 59.21%. The number of stops is reduced by 26.05% on average, with a 45.77% lower standard deviation.
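The control flow described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `run_sequence`, `choose_duration`, `needs_reservice`, the phase names, and the fixed re-service duration are all hypothetical stand-ins (the policy would be the trained PPO agent, and the check would come from the shock wave estimate).

```python
from typing import Callable, List, Tuple

Schedule = List[Tuple[str, float]]


def run_sequence(
    phases: List[str],
    choose_duration: Callable[[str], float],
    needs_reservice: Callable[[str], bool],
    reservice_phase: str,
    reservice_duration: float,
) -> Schedule:
    """Build the executed (phase, duration) schedule for one pass
    through a pre-defined phase sequence."""
    schedule: Schedule = []
    for phase in phases:
        # Stand-in for the RL policy: pick the duration of the next phase.
        duration = choose_duration(phase)
        # Before executing that decision, check whether the designated
        # movement needs to be served again (assumed external check).
        if needs_reservice(phase):
            # Insert a temporary re-service phase ahead of the regular phase.
            schedule.append((reservice_phase, reservice_duration))
        schedule.append((phase, duration))
    return schedule


# Hypothetical four-phase sequence with re-service allowed for "NS_left".
demo = run_sequence(
    phases=["NS_left", "NS_through", "EW_left", "EW_through"],
    choose_duration=lambda phase: 20.0,            # constant policy, for illustration
    needs_reservice=lambda phase: phase == "EW_left",
    reservice_phase="NS_left",
    reservice_duration=8.0,
)
```

Because the schedule can contain phases of different lengths, the time between consecutive agent decisions varies, which is consistent with the abstract's choice of an SMDP formulation.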

What problem does this paper attempt to address?

This paper aims to reduce vehicle delay and the number of stops at signalized intersections by extending adaptive traffic signal control (ATSC) based on reinforcement learning (RL) to scenarios with high left-turn demand. Specifically, the authors combine phase re-service with RL so that the controller can manage traffic more flexibly when left-turn demand surges during peak hours, relieving congestion and shortening vehicle waiting times.

### Main contributions of the paper

1. **Introduction of phase re-service**: A new mechanism, phase re-service, is added to RL-based traffic signal control: a specific phase (such as a protected left turn) can be served more than once within a cycle. This helps clear the left-turn queue more effectively, especially under high demand.
2. **Estimation of queue growth using shock wave theory**: To decide whether phase re-service is required, the authors use shock wave theory to estimate queue expansion for the designated movement. If the estimated queue growth exceeds a preset threshold, a temporary re-service phase is inserted.
3. **Semi-Markov decision process (SMDP) modeling**: Because the RL agent selects the duration of each phase, the authors model the control problem as an SMDP and solve it with the proximal policy optimization (PPO) algorithm.

### Experimental results

The experiments show that the method significantly reduces vehicle delay and the number of stops across multiple traffic demand scenarios:

- **Average vehicle delay**: reduced by up to 29.95%, with the standard deviation reduced by up to 59.21%.
- **Average number of stops**: reduced by up to 26.05%, with the standard deviation reduced by up to 45.77%.

These improvements appear in overall intersection performance and are particularly pronounced for protected left-turn movements.

### Conclusion

By combining reinforcement learning with phase re-service, the proposed method makes traffic signal control more flexible and efficient, especially under high left-turn demand. The experimental results support its effectiveness and suggest a direction for future traffic signal control systems.
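The shock wave check mentioned in the contributions can be illustrated with the textbook formula for the speed of the boundary between two traffic states, w = (q_down - q_up) / (k_down - k_up). The sketch below is an assumption-laden illustration, not the paper's exact estimator: the function names, the jam-density value, and the use of a turn-bay length as the threshold are all hypothetical.

```python
def shockwave_speed_kmh(q_up: float, k_up: float, q_down: float, k_down: float) -> float:
    """Speed (km/h) of the boundary between an upstream and a downstream
    traffic state, given flows in veh/h and densities in veh/km.
    A negative value means the boundary moves upstream."""
    return (q_down - q_up) / (k_down - k_up)


def queue_growth_m(arrival_flow_vph: float, arrival_density_vpkm: float,
                   jam_density_vpkm: float, red_time_s: float) -> float:
    """Estimated distance (m) the back of queue moves upstream during red.
    The stopped queue is modeled as zero flow at jam density."""
    w = shockwave_speed_kmh(arrival_flow_vph, arrival_density_vpkm,
                            0.0, jam_density_vpkm)
    # Convert km/h to m/s, then multiply by the red duration.
    return abs(w) * 1000 / 3600 * red_time_s


def needs_reservice(queue_now_m: float, growth_m: float, threshold_m: float) -> bool:
    """Assumed rule: re-serve when the projected queue exceeds a threshold,
    here imagined as the storage length of the left-turn bay."""
    return queue_now_m + growth_m > threshold_m


# Illustrative numbers only: 600 veh/h arriving at 20 veh/km,
# jam density 140 veh/km, 60 s until the movement is served again.
growth = queue_growth_m(600.0, 20.0, 140.0, 60.0)
decision = needs_reservice(queue_now_m=30.0, growth_m=growth, threshold_m=100.0)
```

With these numbers the queue-forming wave travels upstream at 5 km/h, so the queue grows by roughly 83 m over the 60 s interval, and a 30 m standing queue would overrun the assumed 100 m bay.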

Phase Re-service in Reinforcement Learning Traffic Signal Control

Uniformity of Markov Elements in Deep Reinforcement Learning for Traffic Signal Control

Learning Phase Competition for Traffic Signal Control

Reinforcement Learning for Traffic Signal Control in Hybrid Action Space

A Deep Reinforcement Learning Approach to Traffic Signal Control With Temporal Traffic Pattern Mining

A Deep Reinforcement Learning Approach for Isolated Intersection Traffic Signal Control with Long-Short Term Memory Network

A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

A multi‐agent deep reinforcement learning approach for traffic signal coordination

Traffic Signal Control Using Hybrid Action Space Deep Reinforcement Learning

A Reinforcement Learning Approach for Intelligent Traffic Signal Control at Urban Intersections

Traffic Signal Timing via Parallel Reinforcement Learning

Traffic light control with reinforcement learning

Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Training Reinforcement Learning Agent for Traffic Signal Control under Different Traffic Conditions

Learning in practice: reinforcement learning-based traffic signal control augmented with actuated control

First steps towards real-world traffic signal control optimisation by reinforcement learning

Reinforcement Learning Approaches for Traffic Signal Control under Missing Data

PhaseLight: an Universal and Practical Traffic Signal Control Algorithms Based on Reinforcement Learning

Deep reinforcement learning for traffic signal control with consistent state and reward design approach

DynamicLight: Dynamically Tuning Traffic Signal Duration with DRL