Optimizing Autonomous Vehicle Navigation with DQN and PPO: A Reinforcement Learning Approach

Rishabh Sharma, Prateek Garg
DOI: https://doi.org/10.1109/APCIT62007.2024.10673440
2024-07-26
Abstract: The rapid development of autopilot software technology calls for intelligent, efficient algorithms that can navigate difficult and unpredictable situations. This study examines the practicality of two common reinforcement learning techniques for autonomous vehicle navigation: the Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). A low-fidelity driving simulator was used first, followed by a high-fidelity traffic simulator. Both algorithms were trained and tested across multiple driving scenarios to evaluate their effectiveness under different conditions. The results indicate that both DQN and PPO outperform traditional models, with PPO proving more effective at maintaining pace, navigating efficiently, and reducing travel time. PPO recorded a 95% completion rate and 83% efficiency, exceeding the 89% completion rate and 78% efficiency recorded by DQN. These results point to sophisticated reinforcement learning as a promising route to improved safety and efficiency in the autonomous vehicle sector. The research also identifies the influence of training parameters on the final performance of these models, information that is crucial for tuning self-navigation systems. The study extends scientific knowledge on the use of reinforcement learning algorithms in autonomous systems and proposes considerations for their realistic implementation, emphasizing the need for further research before the approach can be fully realized in the real world.
Computer Science, Engineering