Abstract: Reinforcement learning (RL) has emerged as a promising solution for addressing traffic signal control (TSC) challenges. While most RL-based TSC systems typically employ an online approach, facilitating frequent active interaction with the environment, learning such strategies in the real world is impractical due to safety and risk concerns. To tackle these challenges, this study introduces an innovative offline data-driven approach, called DataLight. DataLight employs effective state representations and reward function by capturing vehicular speed information within the environment. It then segments roads to capture spatial information and further enhances the spatially segmented state representations with sequential modeling. The experimental results demonstrate the effectiveness of DataLight, showcasing superior performance compared to both state-of-the-art online and offline TSC methods. Additionally, DataLight exhibits robust learning capabilities concerning real-world deployment issues. The code is available at https://github.com/LiangZhang1996/DataLight.
### What problems does this paper attempt to solve?
This paper aims to address several key challenges in traffic signal control (TSC):
1. **Limitations of online reinforcement learning (Online RL)**:
- **Safety and risk**: Most existing RL-based TSC systems learn online, refining their strategies through frequent interaction with the environment. Such learning is impractical in the real world because of safety and risk concerns.
- **Data requirements and cost**: Online RL requires a large amount of real-time interaction data, which can be expensive or difficult to obtain in some environments.
2. **Insufficient application of offline reinforcement learning (Offline RL)**:
- Offline RL learns from historical data and so reduces the need for real-time interaction, but it has seen relatively little application in the TSC field. It also faces problems such as data distribution shift, which can limit the scalability and performance of the model.
3. **Adaptability issues of existing TSC methods**:
- Traditional TSC methods rely on manually designed traffic signal plans or predefined rules, which limit their flexibility and adaptability in handling diverse traffic conditions.
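To make the online/offline distinction concrete, the following is a minimal sketch of offline learning from a fixed log of transitions. It uses plain tabular Q-learning, which is not DataLight's algorithm; the state labels, action indices, rewards, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def offline_q_learning(transitions, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning that only replays a fixed log of transitions.

    transitions: list of (state, action, reward, next_state) tuples.
    The environment is never queried during training.
    """
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        for state, action, reward, next_state in transitions:
            # Bootstrapped target built from logged data only.
            target = reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


def greedy_action(q, state):
    """Return the action index with the highest learned value."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical log: states are coarse congestion labels, actions are phase indices.
logged = [
    ("ns_queue", 0, 1.0, "clear"),
    ("ns_queue", 1, -1.0, "ns_queue"),
    ("ew_queue", 1, 1.0, "clear"),
    ("ew_queue", 0, -1.0, "ew_queue"),
    ("clear", 0, 0.0, "clear"),
    ("clear", 1, 0.0, "clear"),
]
q_table = offline_q_learning(logged, n_actions=2)
print(greedy_action(q_table, "ns_queue"), greedy_action(q_table, "ew_queue"))
```

State-action pairs that never appear in the log keep their initial value of zero, so the learned values say nothing reliable about them. This is a simple view of the distribution shift problem mentioned above.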
### Proposed solutions
To solve the above problems, the paper introduces an innovative offline data-driven method, **DataLight**. Specifically, the main contributions of DataLight include:
1. **Offline data-driven RL model**:
- DataLight is an offline RL method that does not require real-time interaction, enhancing its practicality and applicability in various environments.
2. **Innovative state representation and reward function design**:
- DataLight builds its state representation and reward function from vehicle speed information, segments roads to capture spatial information, and enhances the segmented representation with sequential modeling. This gives it a finer-grained view of traffic dynamics across the controlled environment.
3. **Performance surpassing the existing state-of-the-art (SOTA) models**:
- Experimental results show that DataLight significantly outperforms existing online and offline TSC methods on multiple benchmark datasets, becoming a new performance benchmark.
4. **Strong learning ability**:
- DataLight can learn effective strategies from a small amount of offline data and handles periodic real-world offline data well, which addresses practical deployment concerns.
Through these improvements, DataLight is expected to improve the efficiency of traffic management systems, optimize traffic flow, reduce congestion, and ultimately enhance the overall performance of the transportation system.
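The idea of a speed-aware, spatially segmented state can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the number of segments, the per-segment features (vehicle count and mean speed), the zero default for empty segments, and the name `segment_features` are all hypothetical.

```python
def segment_features(vehicles, road_length, n_segments):
    """Summarize one incoming road as per-segment vehicle counts and mean speeds.

    vehicles: list of (distance_to_stop_line_m, speed_mps) pairs.
    Segment 0 is the segment closest to the intersection.
    """
    seg_len = road_length / n_segments
    counts = [0] * n_segments
    speed_sums = [0.0] * n_segments
    for dist, speed in vehicles:
        if not 0 <= dist <= road_length:
            continue  # ignore vehicles outside the observed stretch
        # Clamp so a vehicle exactly at road_length falls in the last segment.
        idx = min(int(dist // seg_len), n_segments - 1)
        counts[idx] += 1
        speed_sums[idx] += speed
    # Assumption: an empty segment reports a mean speed of 0.0.
    mean_speeds = [s / c if c else 0.0 for s, c in zip(speed_sums, counts)]
    return counts, mean_speeds


# Hypothetical snapshot of a 300 m approach split into three 100 m segments.
snapshot = [(5.0, 0.0), (20.0, 2.0), (120.0, 10.0), (290.0, 14.0)]
counts, mean_speeds = segment_features(snapshot, road_length=300.0, n_segments=3)
print(counts, mean_speeds)
```

In this snapshot the segment nearest the stop line holds two slow vehicles while the far segments hold single fast ones, a pattern that a plain queue length would not distinguish. A sequence model could then consume the segments in order, though how DataLight does so is not described in this summary.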