Distributional Soft Actor-Critic for Decision-Making in On-Ramp Merge Scenarios

Jingliang Duan, Yiting Kong, Chunxuan Jiao, Yang Guan, Shengbo Eben Li, Chen, Bingbing Nie, Keqiang Li
DOI: https://doi.org/10.1007/s42154-023-00260-1
2024-01-01
Automotive Innovation
Abstract: Merging onto the highway from an on-ramp is an essential scenario for automated driving. Decision-making in this scenario must balance safety and efficiency to optimize a long-term objective, which is challenging because the scenario is dynamic, stochastic, and adversarial. Existing learning-based methods struggle to meet the safety requirements. This paper proposes a reinforcement-learning-based decision-making method, called the Shielded Distributional Soft Actor-Critic (Shielded DSAC), built on a framework of offline training and online correction. Shielded DSAC incorporates safety considerations into policy evaluation during offline training and applies a safety shield parameterized with a barrier function during online correction. These two measures support each other to improve safety without sacrificing efficiency. The study verifies Shielded DSAC in a simulated on-ramp merge scenario. The results indicate that Shielded DSAC achieves the best safety performance among the compared baseline algorithms while maintaining efficient driving.
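To make the idea of online correction with a barrier function concrete, the following is a minimal, generic sketch of a safety shield and not the paper's formulation. It assumes a one-dimensional car-following model, a barrier h(s) = gap - D_SAFE, and a discrete-time barrier condition h(s') >= (1 - LAMBDA) * h(s). The names State, barrier, step, is_safe, and shield, together with all constants and the constant-speed lead vehicle, are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the paper.
DT = 0.1                   # control period [s]
D_SAFE = 5.0               # minimum allowed gap [m]
LAMBDA = 0.2               # barrier decay rate, 0 < LAMBDA <= 1
A_MIN, A_MAX = -4.0, 2.0   # acceleration limits [m/s^2]


@dataclass
class State:
    gap: float         # distance to the lead vehicle [m]
    ego_speed: float   # [m/s]
    lead_speed: float  # [m/s]


def barrier(s: State) -> float:
    """Barrier value h(s); the state counts as safe when h(s) >= 0."""
    return s.gap - D_SAFE


def step(s: State, accel: float) -> State:
    """One-step prediction, assuming the lead vehicle keeps its speed."""
    new_speed = max(0.0, s.ego_speed + accel * DT)
    mean_speed = 0.5 * (s.ego_speed + new_speed)
    new_gap = s.gap + (s.lead_speed - mean_speed) * DT
    return State(new_gap, new_speed, s.lead_speed)


def is_safe(s: State, accel: float) -> bool:
    """Discrete-time barrier condition: h(s') >= (1 - LAMBDA) * h(s)."""
    return barrier(step(s, accel)) >= (1.0 - LAMBDA) * barrier(s)


def shield(s: State, proposed: float, n_grid: int = 60) -> float:
    """Return the proposed action if it is safe, otherwise the closest safe one."""
    proposed = min(max(proposed, A_MIN), A_MAX)
    if is_safe(s, proposed):
        return proposed
    # Search a uniform grid of accelerations for actions that satisfy the condition.
    candidates = [A_MIN + i * (A_MAX - A_MIN) / n_grid for i in range(n_grid + 1)]
    safe = [a for a in candidates if is_safe(s, a)]
    if not safe:
        return A_MIN  # no candidate is safe: fall back to maximum braking
    return min(safe, key=lambda a: abs(a - proposed))


if __name__ == "__main__":
    far = State(gap=50.0, ego_speed=20.0, lead_speed=20.0)
    near = State(gap=6.0, ego_speed=17.05, lead_speed=15.0)
    print(shield(far, 1.0))   # large gap: the proposed action is left unchanged
    print(shield(near, 1.0))  # small gap: the action is replaced by braking
```

In this sketch the shield acts only at execution time: a learned policy would propose an acceleration, and the shield would alter it only when the one-step barrier condition fails. The grid search keeps the example dependency-free; a practical system would more likely solve a small constrained optimization instead.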