GITSR: Graph Interaction Transformer-based Scene Representation for Multi Vehicle Collaborative Decision-making

Xingyu Hu, Lijun Zhang, Dejian Meng, Ye Han, Lisha Yuan
DOI: https://doi.org/10.48550/arXiv.2411.01608
2024-11-03
Abstract: In this study, we propose GITSR, an effective framework for Graph Interaction Transformer-based Scene Representation for multi-vehicle collaborative decision-making in intelligent transportation systems. In mixed traffic where Connected Automated Vehicles (CAVs) and Human Driving Vehicles (HDVs) coexist, the framework aims to enhance the CAVs' understanding of the environment and thereby improve their decision-making capabilities. It focuses on efficient scene representation and the modeling of spatial interaction behaviors of traffic states. We first extract features of the driving environment based on the background of intelligent networking. Subsequently, the local scene representation, which is based on an agent-centric, dynamic occupancy grid, is calculated by the Transformer module. In addition, the feasible region of the map is captured through the multi-head attention mechanism to reduce vehicle collisions. Notably, spatial interaction behaviors, based on motion information, are modeled as graph structures and extracted via a Graph Neural Network (GNN). Ultimately, the collaborative decision-making among multiple vehicles is formulated as a Markov Decision Process (MDP), with driving actions output by Reinforcement Learning (RL) algorithms. Our algorithmic validation is executed within the extremely challenging scenario of the highway off-ramp task, thereby substantiating the superiority of the agent-centric approach to scene representation. Simulation results demonstrate that the GITSR method can not only effectively capture scene representation but also extract spatial interaction data, outperforming the baseline method across various comparative metrics.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition, Multiagent Systems, Robotics
What problem does this paper attempt to address?
The paper addresses multi-vehicle collaborative decision-making in a mixed traffic environment. In particular, where connected automated vehicles (CAVs) and human-driven vehicles (HDVs) coexist, it asks how to improve the CAVs' understanding of the environment and thereby their decision-making ability. The paper focuses on efficient scene representation and on modeling the spatial interaction behaviors of traffic states. The proposed GITSR framework targets the following challenges:

1. **Efficient Scene Representation**: In a dynamic traffic environment, effectively capturing and representing traffic scenes is crucial for the decision-making of autonomous vehicles. The paper proposes a local scene representation method based on the Transformer module, which processes information from a vehicle-centered dynamic occupancy grid and enhances the understanding of the surrounding traffic environment.
2. **Spatial Interaction Behavior Modeling**: The real-time interactions among traffic participants significantly affect how traffic scenes are interpreted and which actions autonomous vehicles output. The paper models spatial interactions, based on motion information, as a graph and extracts the interaction behaviors of traffic participants with a graph neural network (GNN).
3. **Multi-vehicle Collaborative Decision-making**: The paper formulates the multi-vehicle collaborative decision-making problem as a Markov decision process (MDP) and outputs driving actions through a reinforcement learning (RL) algorithm, aiming to optimize the collaborative decision-making process among multiple CAVs and improve safety, efficiency, and task success rate.
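The spatial interaction modeling in point 2 can be illustrated with a minimal sketch. The summary does not specify how the paper builds its graph or which GNN layer it uses, so everything here is an assumption made for illustration: vehicles are linked when they lie within a hypothetical distance threshold `radius`, and a single message-passing step averages features over each vehicle's neighborhood. The function names `build_interaction_graph` and `mean_aggregate` are likewise hypothetical.

```python
import math


def build_interaction_graph(positions, radius):
    """Return a symmetric adjacency matrix with self-loops.

    Assumption: vehicles i and j interact when their Euclidean
    distance is at most `radius` (the paper's rule may differ).
    """
    n = len(positions)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps each vehicle's own features
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def mean_aggregate(adj, features):
    """One message-passing step: average features over each neighborhood."""
    dim = len(features[0])
    out = []
    for row in adj:
        degree = sum(row)  # at least 1 because of the self-loop
        out.append([
            sum(row[j] * features[j][k] for j in range(len(row))) / degree
            for k in range(dim)
        ])
    return out


# Hypothetical scene: (x, y) positions in metres, one feature (speed in m/s).
positions = [(0.0, 0.0), (10.0, 0.0), (100.0, 3.5)]
speeds = [[20.0], [30.0], [15.0]]

adj = build_interaction_graph(positions, radius=30.0)
mixed = mean_aggregate(adj, speeds)
print(mixed)  # [[25.0], [25.0], [15.0]]
```

The two nearby vehicles exchange information and end up with the same averaged feature, while the distant vehicle keeps its own value. A learned GNN layer would additionally apply trainable weights and a nonlinearity to the aggregated features.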
In summary, the paper's main objective is to solve the key problems of multi-vehicle collaborative decision-making in a complex, dynamic traffic environment through the design and implementation of the GITSR framework. It concentrates on how to represent scene information effectively and how to model the spatial interaction behaviors of traffic participants, thereby enhancing the decision-making performance of autonomous vehicles in mixed traffic.
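For reference, the MDP formulation mentioned above can be written in standard textbook notation. The symbols are the generic ones and are an assumption here: this summary does not give the paper's specific state, action, or reward definitions. $\mathcal{S}$ is the state space, $\mathcal{A}$ the action space, $P$ the transition probability, $R$ the reward function, and $\gamma \in [0, 1)$ the discount factor. The RL algorithm searches for a policy $\pi^{*}$ that maximizes the expected discounted return.

```latex
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \right]
```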