OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles

Rui Du, Kai Zhao, Jinlong Hou, Qiang Zhang, Peter Zhang
2024-10-09
Abstract: Coordination among connected and autonomous vehicles (CAVs) is advancing due to developments in control and communication technologies. However, much of the current work is based on oversimplified and unrealistic task-specific assumptions, which may introduce vulnerabilities. This is critical because CAVs not only interact with their environment but are also integral parts of it. Insufficient exploration can result in policies that carry latent risks, highlighting the need for methods that explore the environment both extensively and efficiently. This work introduces OPTIMA, a novel distributed reinforcement learning framework for cooperative autonomous vehicle tasks. OPTIMA alternates between thorough data sampling from environmental interactions and multi-agent reinforcement learning algorithms to optimize CAV cooperation, emphasizing both safety and efficiency. Our goal is to improve the generality and performance of CAVs in highly complex and crowded scenarios. Furthermore, the industrial-scale distributed training system easily adapts to different algorithms, reward functions, and strategies.
Multiagent Systems, Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses a weakness in current approaches to coordinating connected and autonomous vehicles (CAVs): they largely rely on oversimplified, unrealistic, task-specific assumptions, which can introduce safety risks. CAVs not only interact with the environment but are also an integral part of it, so their behaviors influence one another and form complex feedback loops. If CAVs are not exposed to a wide variety of scenarios during training, they may fail to handle the diversity of real-world driving: faced with an unfamiliar situation, a CAV may behave unexpectedly and trigger a chain reaction that endangers other vehicles. To address these issues, the paper proposes OPTIMA, a distributed reinforcement learning framework for cooperative autonomous vehicle tasks. OPTIMA optimizes CAV cooperation by alternating between thorough sampling of data from environmental interactions and multi-agent reinforcement learning updates, emphasizing both safety and efficiency. Its goal is to improve the generality and performance of CAVs in highly complex and congested scenarios. The paper also states that its industrial-scale distributed training system adapts easily to different algorithms, reward functions, and strategies.
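The alternation between sampling and learning described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy two-vehicle "who goes first" reward, the REINFORCE-style update, and every name in the code (`shared_reward`, `sample_rollouts`, `update_policy`, `train`) are assumptions made for illustration, since the abstract does not specify the environment, the algorithm, or the system design. The workers are also simulated sequentially here rather than distributed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_reward(actions):
    """Hypothetical cooperative reward: +1 if exactly one vehicle goes,
    -1 if more than one goes (collision), 0 if every vehicle yields."""
    going = sum(actions)
    if going == 1:
        return 1.0
    return -1.0 if going > 1 else 0.0


def sample_rollouts(logits, n_workers, episodes_per_worker, rng):
    """Sampling phase: each simulated worker collects one-step episodes
    using the current per-agent "go" probabilities."""
    batch = []
    for _ in range(n_workers):
        for _ in range(episodes_per_worker):
            actions = [1 if rng.random() < sigmoid(l) else 0 for l in logits]
            batch.append((actions, shared_reward(actions)))
    return batch


def update_policy(logits, batch, lr):
    """Learning phase: one REINFORCE-style step per agent on the shared
    reward (an assumed stand-in for the paper's unspecified MARL algorithm)."""
    baseline = sum(r for _, r in batch) / len(batch)
    new_logits = []
    for i, logit in enumerate(logits):
        p = sigmoid(logit)
        # Gradient of log Bernoulli(a; p) with respect to the logit is (a - p).
        grad = sum((a[i] - p) * (r - baseline) for a, r in batch) / len(batch)
        new_logits.append(logit + lr * grad)
    return new_logits


def train(n_agents=2, iterations=200, n_workers=4,
          episodes_per_worker=16, lr=0.5, seed=0):
    """Alternate between the sampling phase and the learning phase."""
    rng = random.Random(seed)
    logits = [0.0] * n_agents
    history = []
    for _ in range(iterations):
        batch = sample_rollouts(logits, n_workers, episodes_per_worker, rng)
        logits = update_policy(logits, batch, lr)
        history.append(sum(r for _, r in batch) / len(batch))
    return logits, history


if __name__ == "__main__":
    final_logits, rewards = train()
    print("go-probabilities:", [round(sigmoid(l), 3) for l in final_logits])
    print("mean reward, last iteration:", round(rewards[-1], 3))
```

Keeping the two phases as separate functions means the sampling side could, in principle, be scaled across many workers without changing the update rule, and the reward function or update rule could be swapped independently. This loosely mirrors the adaptability the abstract claims for the training system.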