Attentive Hybrid Reinforcement Learning-Based Eco-Driving Strategy for Connected Vehicles with Hybrid Action Spaces and Surrounding Vehicles Attention

Menglin Li, Xiangqi Wan, Mei Yan, Jingda Wu, Hongwen He
DOI: https://doi.org/10.1016/j.enconman.2024.119059
IF: 10.4
2024-01-01
Energy Conversion and Management
Abstract: In environments characterized by complex multi-source traffic information, the interaction between the ego vehicle and surrounding vehicles, along with behavioral interference, presents a fundamental challenge to the stability of eco-driving strategies. Therefore, an integrated-action eco-driving strategy based on an attention mechanism is proposed to explore how the strategy differs under varying interaction relationships with surrounding vehicles. Specifically, within an eco-driving strategy framework built on the twin delayed deep deterministic policy gradient (TD3) algorithm, a multi-head self-attention module is introduced to extract features from traffic flow information. A hybrid action representation mechanism is incorporated to coordinate longitudinal acceleration control with lateral lane-change decisions, integrating the attention output features to stabilize strategy shifts. Results indicate that the proposed strategy outperforms parameterized action deep deterministic policy gradient (PADDPG) strategies and discrete attention-based deep Q-network (Attentive-DQN) eco-driving strategies under adversarial reward functions, achieving a 42.18% stability improvement over PADDPG and an 84.79% time stability improvement over Attentive-DQN. Compared to vehicles using the Krauss car-following model and the LC2013 lane-changing model, the proposed strategy achieves a 30.25% energy optimization with only a 1.01% loss in time efficiency. Additionally, as behavioral interference factors are given more weight in the strategy objectives, the parameter distribution of the attention module changes to enhance the ego vehicle’s interaction with interfering objects. This indicates that considering surrounding vehicle dynamics during strategy execution shapes different interaction relationships.
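The pipeline the abstract describes, attention over surrounding vehicles feeding a hybrid action, can be illustrated with a small sketch. It is not the authors' architecture: the feature layout, the dimensions, the three lane options, the `a_max` bound, and all function names are assumptions made for illustration, the weights are random and untrained, and only the ego vehicle's row of the self-attention is computed. It shows one way attention-pooled traffic features could drive a continuous acceleration and a discrete lane-change choice together.

```python
import math
import random

def matvec(m, v):
    # Multiply matrix m (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    # Numerically stable softmax.
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rng, rows, cols):
    # Random placeholder weights; a trained network would learn these.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def ego_attention(vehicles, heads, rng, d_head=4):
    """Multi-head attention with the ego vehicle (vehicles[0]) as the query."""
    d_in = len(vehicles[0])
    context, weights = [], []
    for _ in range(heads):
        wq, wk, wv = (rand_matrix(rng, d_head, d_in) for _ in range(3))
        q = matvec(wq, vehicles[0])
        keys = [matvec(wk, v) for v in vehicles]
        vals = [matvec(wv, v) for v in vehicles]
        # Scaled dot-product score between the ego query and every vehicle.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_head)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Attention-weighted sum of values, concatenated across heads.
        context += [sum(wi * val[j] for wi, val in zip(w, vals))
                    for j in range(d_head)]
    return context, weights

def hybrid_action(context, rng, a_max=3.0):
    """Continuous acceleration plus a discrete lane choice (assumed layout)."""
    accel_w = [rng.uniform(-0.5, 0.5) for _ in context]
    lane_w = rand_matrix(rng, 3, len(context))
    # tanh bounds the acceleration to [-a_max, a_max] (a_max is assumed).
    accel = a_max * math.tanh(sum(w * c for w, c in zip(accel_w, context)))
    lane_logits = matvec(lane_w, context)
    lane = ("left", "keep", "right")[lane_logits.index(max(lane_logits))]
    return accel, lane

rng = random.Random(0)
# Assumed features per vehicle: [relative x, relative y, speed, heading].
vehicles = [
    [0.0, 0.0, 15.0, 0.0],    # ego vehicle
    [20.0, 0.0, 12.0, 0.0],   # leader in the same lane
    [-10.0, 3.5, 17.0, 0.0],  # follower in the adjacent lane
]
context, weights = ego_attention(vehicles, heads=2, rng=rng)
accel, lane = hybrid_action(context, rng)
print(weights, accel, lane)
```

Each row of `weights` sums to one and shows how strongly a head attends to each vehicle, which is the kind of quantity the abstract refers to when it says the attention module's parameter distribution changes with the strategy objectives.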