Deep reinforcement learning for autonomous driving in uncontrolled intersections of Indian roads
Aravindh R. Shankar, Ajay Mittur, Adithya Narasimhan, Kayarvizhy N
DOI: https://doi.org/10.1007/s11042-024-19812-6
IF: 2.577
2024-08-18
Multimedia Tools and Applications
Abstract: Autonomous driving agents that must recognize traffic signals and signboards and handle lane changes depend on training across a variety of real-life scenarios. This paper addresses uncontrolled intersections with chaotic traffic, a common scenario in the Indian driving context. Using Deep Deterministic Policy Gradient (DDPG), an actor-critic reinforcement learning algorithm that learns the Q-value function and the policy function concurrently, the paper studies a multi-actor, single-agent environment at an uncontrolled intersection with bidirectional traffic. The ego vehicle is assigned checkpoints as destinations, and a reward function measures the success of each training method, allowing comparison with other off-policy approaches. Performance was benchmarked against off-policy training methods used in contemporary autonomous driving tasks, such as Soft Actor-Critic (SAC), Advantage Actor-Critic (A2C) and Importance Weighted Actor-Learner Architecture (IMPALA), as well as against Attention-aware Proximal Policy Optimization (PPO). Among these, DDPG and the twin-delayed deep deterministic policy gradient (TD3) method accumulated the highest reward over a fixed 1400 training steps. MetaDrive's procedural generation was also used to assess the generalization ability of the respective models. A comparative analysis of the driving-line trajectories showed that DDPG and TD3 consistently outperformed SAC, which exhibited more random navigation. DDPG and TD3 are also shown to generalize well to other scenarios and to adapt better to varying traffic densities, numbers of lanes, and vehicle types.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
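The abstract describes DDPG as learning a Q-value function and a policy concurrently. As a rough illustration of the two standard ingredients of that scheme (the bootstrapped critic target and the slow-moving target networks), a minimal sketch follows. It is not the authors' implementation: the function names `td_target` and `soft_update`, the flat weight lists, and the default values of `gamma` and `tau` are assumptions chosen for illustration only.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic regression target: y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    `next_q` stands for the target critic's value of the target actor's
    action in the next state. `gamma` is an assumed discount factor.
    """
    mask = 0.0 if done else 1.0  # no bootstrapping past a terminal state
    return reward + gamma * mask * next_q


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging of target weights: theta' <- tau * theta + (1 - tau) * theta'.

    Weights are flat lists here purely for illustration; `tau` is an assumed rate.
    """
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    # One transition: reward 1.0, target critic estimates 10.0 for the next state.
    print(td_target(reward=1.0, next_q=10.0, done=False, gamma=0.9))
    # Target weights move halfway toward the online weights when tau = 0.5.
    print(soft_update(target=[0.0, 1.0], online=[1.0, 1.0], tau=0.5))
```

In a full DDPG loop the critic would be regressed toward `td_target`, the actor would be updated to increase the critic's value of its own actions, and `soft_update` would be applied to both target networks after each step. TD3, also compared in the abstract, additionally uses two critics and delayed actor updates.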