Autonomous Algorithm for Training Autonomous Vehicles with Minimal Human Intervention

Sang-Hyun Lee, Daehyeok Kwon, Seung-Woo Seo
2024-05-22
Abstract: Reinforcement learning (RL) provides a compelling framework for enabling autonomous vehicles to continue to learn and improve diverse driving behaviors on their own. However, training real-world autonomous vehicles with current RL algorithms presents several challenges. One critical challenge, often overlooked in these algorithms, is the need to reset a driving environment between every episode. While resetting an environment after each episode is trivial in simulated settings, it demands significant human intervention in the real world. In this paper, we introduce a novel autonomous algorithm that allows off-the-shelf RL algorithms to train an autonomous vehicle with minimal human intervention. Our algorithm takes into account the learning progress of the autonomous vehicle to determine when to abort episodes before it enters unsafe states and where to reset it for subsequent episodes in order to gather informative transitions. The learning progress is estimated based on the novelty of both current and future states. We also take advantage of rule-based autonomous driving algorithms to safely reset an autonomous vehicle to an initial state. We evaluate our algorithm against baselines on diverse urban driving tasks. The experimental results show that our algorithm is task-agnostic and achieves better driving performance with fewer manual resets than baselines.
Robotics, Machine Learning
What problem does this paper attempt to address?
The paper addresses the difficulty of training autonomous vehicles with reinforcement learning (RL) in the real world, in particular the need to reset the driving environment between episodes. In simulation a reset is trivial, but in the real world it demands significant human intervention: someone must stop the vehicle before it enters an unsafe state and return it to an initial state. The paper proposes an autonomous algorithm that allows off-the-shelf RL algorithms to train an autonomous vehicle with minimal human intervention. The algorithm uses the vehicle's learning progress to decide when to abort an episode before the vehicle enters an unsafe state, and where to reset it for the next episode so that it gathers informative transitions. Learning progress is estimated from the novelty of both current and future states, with novelty being higher in unseen states. The paper also uses rule-based autonomous driving algorithms to return the vehicle safely to an initial state. According to the reported experiments, the algorithm outperforms baselines on diverse urban driving tasks, requiring fewer manual resets and improving sample efficiency. The algorithm is described as task-agnostic and compatible with off-the-shelf RL algorithms, and it introduces the use of rule-based algorithms to assist RL training of autonomous vehicles.
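The abort-and-reset idea summarized above can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes a simple count-based novelty measure over a discretized state space, a fixed novelty threshold, and a list of predicted future states supplied by the caller. The names `CountNovelty`, `should_abort`, and `choose_reset` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from typing import Sequence, Tuple

State = Tuple[float, ...]


class CountNovelty:
    """Assumed novelty measure: 1 / sqrt(1 + visits) of the state's grid cell."""

    def __init__(self, cell_size: float = 1.0) -> None:
        self.cell_size = cell_size
        self.counts: Counter = Counter()

    def _cell(self, state: State) -> Tuple[int, ...]:
        # Discretize a continuous state into a grid cell index.
        return tuple(int(x // self.cell_size) for x in state)

    def update(self, state: State) -> None:
        # Record one visit to the cell containing this state.
        self.counts[self._cell(state)] += 1

    def novelty(self, state: State) -> float:
        # Unseen cells score 1.0; the score decays as visits accumulate.
        return 1.0 / math.sqrt(1 + self.counts[self._cell(state)])


def should_abort(est: CountNovelty, current: State,
                 predicted_future: Sequence[State], threshold: float) -> bool:
    """Abort if the current or any predicted future state is too novel."""
    states = [current, *predicted_future]
    return max(est.novelty(s) for s in states) > threshold


def choose_reset(est: CountNovelty, candidates: Sequence[State],
                 threshold: float) -> State:
    """Pick the most novel candidate that stays at or below the threshold.

    Falls back to the least novel candidate if none qualifies.
    """
    eligible = [c for c in candidates if est.novelty(c) <= threshold]
    if eligible:
        return max(eligible, key=est.novelty)
    return min(candidates, key=est.novelty)
```

In this sketch, a familiar region keeps the episode running, a predicted move into an unvisited cell triggers an abort, and the reset point is chosen near the edge of what has already been explored. In the setting the paper describes, a rule-based driving algorithm would then take the vehicle to the chosen state. How the paper actually estimates novelty and predicts future states is not specified here.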