Abstract: Reinforcement learning (RL) is a promising way to achieve human-like autonomous driving (HAD) in complex and dynamic traffic, but it faces challenges such as low sample efficiency, partial observability, and sim2real transfer. To address these challenges, we establish a comprehensive solution for RL-driven HAD. First, we propose an efficient training scheme, Deep Recurrent Q-learning from Demonstration (DRQfD), for lane-changing decision-making; it addresses the low sample efficiency of RL and the poor generalization capability of imitation learning (IL). Its LSTM structure can potentially learn to predict the future states of surrounding vehicles, which helps with the partial observability problem in autonomous driving (AD). Second, to reduce the sim2real gap, we build a high-fidelity twin simulator on ROS-Gazebo for LiDAR sensing simulation, model training, and evaluation. Domain randomization is used to improve robustness and generalization ability, making the model easier to transfer to real-world scenarios. In addition, to handle the multi-objective optimization and imbalanced data issues in this scenario, we propose a hierarchical decision-making framework that decomposes the complex decision-making problem into several subtasks, so that the driving policies converge more easily. To avoid the excessive dependence of the decision-making module on the output of the perception module that is typical of modular systems, we train each modularized skill in an end-to-end manner. We compare our method with a vanilla RL method to show the improvement in learning efficiency, and we compare the RL-based model with an IL baseline in terms of safety, travel efficiency, and human-likeness. To further validate the generalization ability of our model, we test it on a real traffic dataset. Finally, we deploy the RL model on physical cars to demonstrate the performance of sim2real transfer.
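
The abstract does not spell out the DRQfD loss. As an illustration of how demonstrations can improve sample efficiency in Q-learning, the following minimal sketch assumes the scheme follows the general "Q-learning from demonstrations" recipe: a one-step TD loss on every transition, plus a large-margin supervised loss on demonstration transitions that pushes the expert's action above all others. The function names, the three-action lane-change layout, and the values of `margin`, `gamma`, and `lambda_margin` are assumptions for illustration, not taken from the paper.

```python
from typing import Sequence


def td_loss(q_sa: float, reward: float, next_q_values: Sequence[float],
            gamma: float = 0.99, done: bool = False) -> float:
    """Squared one-step TD error for a single transition."""
    target = reward if done else reward + gamma * max(next_q_values)
    return (q_sa - target) ** 2


def margin_loss(q_values: Sequence[float], expert_action: int,
                margin: float = 0.8) -> float:
    """Large-margin loss: max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E).

    l(a_E, a) is 0 for the expert action and `margin` otherwise, so the
    loss is zero only when the expert action beats every other action
    by at least `margin`.
    """
    augmented = [q + (0.0 if a == expert_action else margin)
                 for a, q in enumerate(q_values)]
    return max(augmented) - q_values[expert_action]


def combined_loss(q_values: Sequence[float], action: int, reward: float,
                  next_q_values: Sequence[float], is_demo: bool,
                  gamma: float = 0.99, done: bool = False,
                  margin: float = 0.8, lambda_margin: float = 1.0) -> float:
    """TD loss on every sample; margin loss only on demonstration samples."""
    loss = td_loss(q_values[action], reward, next_q_values, gamma, done)
    if is_demo:
        loss += lambda_margin * margin_loss(q_values, action, margin)
    return loss


if __name__ == "__main__":
    # Hypothetical action layout: 0 = change left, 1 = keep lane, 2 = change right.
    q = [1.0, 2.0, 0.5]
    print(margin_loss(q, expert_action=1))  # expert action already dominates
    print(margin_loss(q, expert_action=0))  # expert action is penalized
```

In a recurrent variant, `q_values` would be produced by an LSTM over the observation history rather than from a single frame; the network itself is omitted here to keep the sketch self-contained.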

Efficient Learning of Urban Driving Policies Using Bird's-Eye-View State Representations

Learning an Efficient and Safe Policy for Highway Driving Using Supervised Learning and Reinforcement Learning

Efficient Latent Representations using Multiple Tasks for Autonomous Driving

Towards Learning Generalizable Driving Policies from Restricted Latent Representations

Policy-Based Reinforcement Learning for Training Autonomous Driving Agents in Urban Areas With Affordance Learning

Model-free Deep Reinforcement Learning for Urban Autonomous Driving

Self-Learned Autonomous Driving at Unsignalized Intersections: A Hierarchical Reinforced Learning Approach for Feasible Decision-Making

Towards Robust Decision-Making for Autonomous Highway Driving Based on Safe Reinforcement Learning

Deep reinforcement learning for autonomous driving in uncontrolled intersections of Indian roads

Autonomous Highway Driving using Deep Reinforcement Learning

CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-based Autonomous Urban Driving

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

Urban Driving with Multi-Objective Deep Reinforcement Learning

Improved Deep Reinforcement Learning with Expert Demonstrations for Urban Autonomous Driving

Multi-Vehicle Mixed-Reality Reinforcement Learning for Autonomous Multi-Lane Driving

Iterative Imitation Policy Improvement for Interactive Autonomous Driving

From Naturalistic Traffic Data to Learning-Based Driving Policy: A Sim-to-Real Study

Spatially and Seamlessly Hierarchical Reinforcement Learning for State Space and Policy Space in Autonomous Driving

Urban Driver: Learning to Drive from Real-world Demonstrations Using Policy Gradients

Conditional Affordance Learning for Driving in Urban Environments

Learning predictive representations in autonomous driving to improve deep reinforcement learning