Collaborative Control Method of Transit Signal Priority Based on Cooperative Game and Reinforcement Learning

Hao Qin, Weishi Zhang
2024-05-01
Abstract: To address the low efficiency in priority signal control within intelligent transportation systems, this study introduces a novel eight-phase priority signal control method, CBQL-TSP, leveraging a hybrid decision-making framework that integrates cooperative game theory and reinforcement learning. This approach conceptualizes the allocation of bus signal priorities as a multi-objective decision-making problem across an eight-phase signal sequence, differentiating between priority and non-priority phases. It employs a cooperative game model to facilitate this differentiation. The developed hybrid decision-making algorithm, CBQL, effectively tackles the multi-objective decision-making challenges inherent in the eight-phase signal sequence. By computing the Shapley value function, it quantifies the marginal contributions of each participant, which in turn inform the construction of a state transition probability equation based on Shapley value ratios. Compared to conventional control methods, the CBQL-TSP method not only upholds the fairness principles of cooperative game theory but also harnesses the adaptive learning capabilities of Q-Learning. This enables dynamic adjustments to signal timing in response to real-time traffic conditions, significantly enhancing the flexibility and efficiency of priority signal control.
Computer Science and Game Theory
What problem does this paper attempt to address?
This paper addresses the inefficiency of priority signal control in intelligent transportation systems. It proposes CBQL-TSP, an eight-phase priority signal control method built on a hybrid decision-making framework that combines cooperative game theory with reinforcement learning. The method treats bus signal priority allocation as a multi-objective decision-making problem over the eight-phase signal sequence and uses a cooperative game model to distinguish priority phases from non-priority phases. The Shapley value function quantifies each participant's marginal contribution, and the ratios of these Shapley values are used to construct a state transition probability equation. Compared with traditional control methods, CBQL-TSP keeps the fairness principle of cooperative games while using the adaptive learning ability of Q-learning to adjust signal timing dynamically in response to real-time traffic conditions, which improves the flexibility and efficiency of priority signal control. According to the paper, simulation experiments verify the method's effectiveness in optimizing traffic flow, improving bus service reliability, and reducing traffic congestion.
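To make the Shapley value step concrete, the following is a minimal sketch of how marginal contributions can be averaged over all player orderings and then normalized into ratios. It is not the paper's implementation: the three player names, the `DEMAND` figures, and the `toy_value` characteristic function are hypothetical, and the abstract does not specify how the resulting ratios enter the state transition probability equation.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}

def shapley_ratios(phi):
    """Normalize Shapley values so they sum to 1."""
    total = sum(phi.values())
    return {p: v / total for p, v in phi.items()}

# Hypothetical toy game (an assumption, not taken from the paper):
# each signal phase brings its own demand, and any coalition of two
# or more phases that includes the bus phase earns a bonus of 1.0.
DEMAND = {"bus_phase": 3.0, "car_phase_a": 2.0, "car_phase_b": 1.0}

def toy_value(coalition):
    base = sum(DEMAND[p] for p in coalition)
    bonus = 1.0 if "bus_phase" in coalition and len(coalition) >= 2 else 0.0
    return base + bonus

phi = shapley_values(list(DEMAND), toy_value)
ratios = shapley_ratios(phi)
for name in DEMAND:
    print(f"{name}: shapley={phi[name]:.4f} ratio={ratios[name]:.4f}")
```

The Shapley values always sum to the value of the grand coalition, so the ratios form a valid probability-like weighting. Enumerating all orderings costs n! evaluations, which is manageable for eight phases (40,320 orderings) but would need sampling for much larger games.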