Communication-Aware Reinforcement Learning for Cooperative Adaptive Cruise Control

Sicong Jiang, Seongjin Choi, Lijun Sun
2024-07-12
Abstract: Cooperative Adaptive Cruise Control (CACC) plays a pivotal role in enhancing traffic efficiency and safety in Connected and Autonomous Vehicles (CAVs). Reinforcement Learning (RL) has proven effective in optimizing complex decision-making processes in CACC, leading to improved system performance and adaptability. Among RL approaches, Multi-Agent Reinforcement Learning (MARL) has shown remarkable potential by enabling coordinated actions among multiple CAVs through Centralized Training with Decentralized Execution (CTDE). However, MARL often faces scalability issues, particularly when CACC vehicles suddenly join or leave the platoon, resulting in performance degradation. To address these challenges, we propose Communication-Aware Reinforcement Learning (CA-RL). CA-RL includes a communication-aware module that extracts and compresses vehicle communication information through forward and backward information transmission modules. This enables efficient cyclic information propagation within the CACC traffic flow, ensuring policy consistency and mitigating the scalability problems of MARL in CACC. Experimental results demonstrate that CA-RL significantly outperforms baseline methods in various traffic scenarios, achieving superior scalability, robustness, and overall system performance while maintaining reliable performance despite changes in the number of participating vehicles.
Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses the performance and scalability problems of Cooperative Adaptive Cruise Control (CACC) in Connected and Autonomous Vehicles (CAVs). It focuses on two points:

1. **Scalability of Multi-Agent Reinforcement Learning (MARL)**: When CACC vehicles suddenly join or leave the platoon, the performance of a MARL system degrades, which affects the reliability and safety of the system.
2. **Policy consistency and full use of traffic flow information**: Traditional Single-Agent Reinforcement Learning (SARL) ensures policy consistency but does not use information from the entire traffic flow, whereas MARL uses that information fully but has difficulty keeping policies consistent across vehicles.

To solve these problems, the paper proposes a new framework, Communication-Aware Reinforcement Learning (CA-RL). The framework introduces a communication-aware module that extracts and compresses vehicle communication information and propagates it cyclically through the traffic flow, which ensures policy consistency and alleviates the scalability issues of MARL in CACC. Experimental results show that CA-RL significantly outperforms baseline methods in various traffic scenarios, with better scalability, robustness, and overall system performance.

### Main contributions of the paper:

1. **Developed the Communication-Aware Reinforcement Learning (CA-RL) framework**, redesigning the vehicle-to-vehicle (V2V) communication architecture and improving the adaptability and efficiency of RL in complex traffic environments.
2. **Introduced a flexible inter-vehicle information transmission mechanism** that is compatible with multiple RL algorithms, making it applicable to a wider range of CACC systems.
3. **Integrated the advantages of single-agent and multi-agent reinforcement learning** by accounting for policy consistency and full use of traffic flow information at the same time, which significantly improves the generalization ability of CA-RL and keeps performance stable across traffic scenarios.

### Methodology:

- **Problem modeling**: The problem is formalized as a Markov Decision Process (MDP), with a defined state space, action space, and reward function.
- **Communication-aware module**: A dedicated communication structure processes the received information through forward and backward information transmission networks and extracts high-dimensional features, improving the decision-making ability and overall performance of the CACC system.
- **Actor-Critic network implementation**: The communication-aware module is combined with an Actor-Critic network to achieve efficient policy updates and value evaluation.

### Experimental setup:

- **Dataset**: Freeway trajectory data from the NGSIM dataset is used for training and testing.
- **Simulation environment**: In each simulation training iteration, one vehicle trajectory is randomly selected as the leading vehicle, followed by $N_{\text{CACC}}$ CACC vehicles. The acceleration of each vehicle is determined through the baseline longitudinal control model and the RL model, and its speed and position are then updated.

In conclusion, the paper addresses the scalability and policy consistency issues of multi-agent reinforcement learning in CACC by proposing the CA-RL framework, offering a new approach to improving traffic efficiency and safety.
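
The simulation loop described under Experimental setup (a leader trajectory, a number of following vehicles, an acceleration per vehicle, then a speed and position update) can be sketched roughly as follows. This is a minimal illustration and not the authors' environment: the time step `DT`, the `Vehicle` class, the linear `gap_controller` with its gains and acceleration limits, and the helper `make_platoon` are all assumptions introduced here. In particular, `gap_controller` is only a hypothetical stand-in for the baseline longitudinal model or a trained RL policy, and the leader is driven by a constant acceleration rather than an NGSIM trajectory.

```python
from dataclasses import dataclass

DT = 0.1  # simulation step in seconds (assumed; not stated in the summary)


@dataclass
class Vehicle:
    position: float  # metres along the lane (vehicle length ignored here)
    speed: float     # metres per second


def gap_controller(gap, speed, lead_speed, time_gap=1.0, standstill=2.0,
                   k_gap=0.45, k_speed=0.25):
    """Hypothetical linear constant-time-gap controller.

    Stands in for the baseline longitudinal model or an RL policy;
    all gains and limits are illustrative assumptions.
    """
    desired_gap = standstill + time_gap * speed
    accel = k_gap * (gap - desired_gap) + k_speed * (lead_speed - speed)
    return max(-3.0, min(2.0, accel))  # clip to assumed acceleration limits


def step(leader, followers, leader_accel, controller=gap_controller, dt=DT):
    """Advance the leader and every follower by one time step."""
    # Compute all accelerations from the current state before moving anyone,
    # so each follower reacts to where its predecessor is now.
    accels = []
    ahead = leader
    for veh in followers:
        gap = ahead.position - veh.position
        accels.append(controller(gap, veh.speed, ahead.speed))
        ahead = veh

    for veh, accel in zip([leader] + followers, [leader_accel] + accels):
        new_speed = max(0.0, veh.speed + accel * dt)  # no reversing
        # Trapezoidal position update using the old and new speed.
        veh.position += 0.5 * (veh.speed + new_speed) * dt
        veh.speed = new_speed


def make_platoon(n_followers, speed=20.0, spacing=22.0):
    """Build a leader plus n_followers vehicles at equilibrium spacing."""
    leader = Vehicle(position=0.0, speed=speed)
    followers = [Vehicle(position=-spacing * (i + 1), speed=speed)
                 for i in range(n_followers)]
    return leader, followers


if __name__ == "__main__":
    leader, followers = make_platoon(n_followers=4)
    for _ in range(50):  # leader brakes gently for 5 simulated seconds
        step(leader, followers, leader_accel=-1.0)
    for i, veh in enumerate(followers, start=1):
        print(f"follower {i}: position={veh.position:.2f} m, "
              f"speed={veh.speed:.2f} m/s")
```

Because `step` only loops over whatever list of followers it is given, vehicles can be appended to or removed from `followers` between steps without changing the update logic. In this sketch, that is the property a learned controller would also need in order to cope with vehicles joining or leaving the platoon.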