Dynamic Game-Theoretical Decision-Making Framework for Vehicle-Pedestrian Interaction with Human Bounded Rationality

Meiting Dang, Dezong Zhao, Yafei Wang, Chongfeng Wei
2024-09-24
Abstract: Human-involved interactive environments pose significant challenges for autonomous vehicle decision-making processes due to the complexity and uncertainty of human behavior. It is crucial to develop an explainable and trustworthy decision-making system for autonomous vehicles interacting with pedestrians. Previous studies often used traditional game theory to describe interactions for its interpretability. However, it assumes complete human rationality and unlimited reasoning abilities, which is unrealistic. To solve this limitation and improve model accuracy, this paper proposes a novel framework that integrates the partially observable Markov decision process with behavioral game theory to dynamically model AV-pedestrian interactions at the unsignalized intersection. Both the AV and the pedestrian are modeled as dynamic-belief-induced quantal cognitive hierarchy (DB-QCH) models, considering human reasoning limitations and bounded rationality in the decision-making process. In addition, a dynamic belief updating mechanism allows the AV to update its understanding of the opponent's rationality degree in real-time based on observed behaviors and adapt its strategies accordingly. The analysis results indicate that our models effectively simulate vehicle-pedestrian interactions and our proposed AV decision-making approach performs well in safety, efficiency, and smoothness. It closely resembles real-world driving behavior and even achieves more comfortable driving navigation compared to our previous virtual reality experimental data.
Robotics
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the interactive decision-making problem between autonomous vehicles (AVs) and pedestrians at unsignalized intersections. Specifically, the paper focuses on how to develop an interpretable and reliable decision-making system in complex and dynamic environments, taking into account the uncertainty and bounded rationality of human behavior, in order to improve the safety, efficiency and smoothness of autonomous vehicles when interacting with pedestrians.

### Main contributions of the paper

1. **Integrating the POMDP framework and behavioral game theory**: In order to deal with the uncertainty and dynamic interaction between autonomous vehicles and pedestrians, the paper proposes a new framework that combines the Partially Observable Markov Decision Process (POMDP) and behavioral game theory.
2. **Dynamic-Belief-Induced Quantal Cognitive Hierarchy Model (DB-QCH)**: In the paper, both autonomous vehicles and pedestrians are modeled as the Dynamic-Belief-Induced Quantal Cognitive Hierarchy Model. This model can comprehensively describe the interaction dynamics and promote more realistic simulations.
3. **Neural-network-based Monte Carlo Tree Search (MCTS) guidance**: The paper develops a neural network trained on previous experimental data to guide the exploration of MCTS in the continuous action space, thereby achieving effective and efficient decision-making.
4. **Dynamic update mechanism**: For the first time, the paper introduces variables to quantify human bounded rationality and proposes a dynamic update mechanism based on the observed environment, enabling autonomous vehicles to make adaptive decisions in real-time environments.

### Core technologies of the solution

- **POMDP framework**: Used for dynamically modeling the decision-making process of autonomous vehicles in environments with incomplete information and uncertainty.
- **Behavioral game theory**: In particular, the Quantal Cognitive Hierarchy Model (QCH), which is used to describe the decision-making processes of autonomous vehicles and pedestrians, taking into account human bounded rationality and different cognitive levels.
- **Combination of neural network and MCTS**: By using a pre-trained neural network to guide the exploration of MCTS in the continuous action space, the efficiency and accuracy of decision-making are improved.
- **Dynamic belief update**: Using the Bayesian method to dynamically update beliefs about the opponent's rationality and cognitive level, making the model more adaptable to changes in the actual environment.

### Formula analysis

1. **Quantal response function**:
   \[ P(a_i)=\frac{e^{\lambda Q(a_i, a_{-i})}}{\sum_{a'_i\in A}e^{\lambda Q(a'_i, a_{-i})}} \]
   This formula represents the probability that agent \(i\) selects strategy \(a_i\) given the opponent's action \(a_{-i}\). Here, \(\lambda\) is the rationality parameter, and \(Q(a_i, a_{-i})\) is the expected payoff of selecting strategy \(a_i\).
2. **Dynamic belief update**:
   \[ P(k|s_{t+1}, a_{-i}^{t+1})=\frac{P(s_{t+1}, a_{-i}^{t+1}|k)b_k^t(k)}{\sum_{k'\in\Theta}P(s_{t+1}, a_{-i}^{t+1}|k')b_k^t(k')} \]
   This formula represents the agent's belief update of the opponent's cognitive level \(k\) after observing the opponent's action \(a_{-i}^{t+1}\) at time step \(t+1\).
3. **Bayesian update of continuous variables**:
   \[ f_{t+1}(\lambda|a_j^t, k)=\frac{P(a_j^t|k, \lambda)f_t(\lambda)}{\int_0^{\infty}P(a_j^t|k, \lambda')f_t(\lambda')d\lambda'} \]
   This formula represents the agent's belief update of the opponent's rationality \(\lambda\) after observing the opponent's action \(a_j^t\) at time step \(t\).

Through these techniques and methods, the paper provides a comprehensive and...
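
To make the three formulas in the formula analysis concrete, here is a minimal Python sketch of each. It is an illustration under stated assumptions, not the paper's implementation. The function names, the example payoffs, and the observed action are hypothetical. The integral over \(\lambda\) from 0 to infinity is approximated on a truncated, evenly spaced grid (0 to 5 here), which is also an assumption.

```python
import math

def quantal_response(payoffs, lam):
    """Quantal response: softmax over payoffs Q(a_i, a_-i) scaled by lam."""
    scaled = [lam * q for q in payoffs]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def update_level_belief(prior, likelihoods):
    """Discrete Bayes update over cognitive levels k.

    prior[k] is the current belief in level k; likelihoods[k] is the
    probability of the new observation if the opponent were level k.
    """
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    if z == 0.0:
        # Assumption: keep the prior if no level explains the observation.
        return list(prior)
    return [u / z for u in unnorm]

def update_lambda_belief(grid, density, likelihood_fn):
    """Grid approximation of the Bayesian update of f_t(lambda).

    grid is an evenly spaced list of lambda values, density holds f_t on
    that grid, and likelihood_fn(lam) returns P(observed action | k, lam).
    """
    step = grid[1] - grid[0]
    unnorm = [likelihood_fn(lam) * d for lam, d in zip(grid, density)]
    z = sum(unnorm) * step  # Riemann sum standing in for the integral
    return [u / z for u in unnorm]

# Hypothetical example: the opponent has three actions with these payoffs
# and is observed taking action 0 (the highest-payoff one).
payoffs = [1.0, 0.5, 0.0]
observed = 0

# Belief over two cognitive levels, updated with made-up likelihoods.
level_belief = update_level_belief([0.5, 0.5], [0.8, 0.2])

# Uniform prior density over lambda on the truncated grid [0, 5].
grid = [i * 0.05 for i in range(101)]
step = grid[1] - grid[0]
prior_density = [1.0 / (len(grid) * step)] * len(grid)

posterior_density = update_lambda_belief(
    grid,
    prior_density,
    lambda lam: quantal_response(payoffs, lam)[observed],
)

prior_mean = sum(l * d for l, d in zip(grid, prior_density)) * step
posterior_mean = sum(l * d for l, d in zip(grid, posterior_density)) * step
print(level_belief, prior_mean, posterior_mean)
```

With \(\lambda = 0\) the quantal response is uniform (a fully random agent), and as \(\lambda\) grows the probability concentrates on the highest-payoff action. In this example, seeing the opponent pick its best action makes larger \(\lambda\) values more likely, so the posterior mean of \(\lambda\) moves above the prior mean.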