Multi-objective Optimization Based Deep Reinforcement Learning for Autonomous Driving Policy
Hao Wang, Zhongli Wang, Xin Cui
DOI: https://doi.org/10.1088/1742-6596/1861/1/012097
2021-03-01
Journal of Physics: Conference Series
Abstract: End-to-end autonomous driving approaches seek to solve the problems of perception, decision and control in an integrated way, which lets them adapt better to new traffic scenes. Due to the diversity of traffic scenes and the uncertainty of the interactions among surrounding vehicles, designing an autonomous driving policy is challenging. Many current methods manually design a separate driving policy for each traffic scene, which leads to suboptimal solutions that are hard to maintain. Most existing deep reinforcement learning (DRL) methods do not work well in complex urban traffic scenes because of their limited sensing and simple driving policies. In this paper, to extend the adaptability of the SAC-based method, we propose taking multiple sensor data as input and using a VAE network to enhance the quality of the training data for the SAC-based DRL method. A multi-constraint reward function is designed for SAC-based driving policy training; it accounts for the errors in transverse distance, longitudinal distance, heading and velocity, and for the possibility of collision. The multiple sensor data include the original RGB image captured by a forward-view camera, 3D LiDAR data and the bird's-eye view map produced by perception processing; the mixed inputs enrich the capability of scene perception. The proposed approach is validated in a multi-vehicle traffic simulation built with CARLA [1]. The results show that, with the driving policy generated by the proposed method, the simulated vehicle can adapt to more challenging traffic scenes, such as passing intersections and turning in crowded urban scenes, and that it outperforms existing similar methods.
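The abstract names the terms of the multi-constraint reward but not its functional form. As a rough illustration only, the sketch below combines the five terms as a negative weighted sum. The linear form, the weight values, and the names `RewardWeights` and `driving_reward` are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not report any values.
    lateral: float = 1.0
    longitudinal: float = 0.5
    heading: float = 0.5
    velocity: float = 0.2
    collision: float = 10.0


def driving_reward(lateral_err, longitudinal_err, heading_err,
                   velocity_err, collision_prob, w=RewardWeights()):
    """Return a scalar reward: larger errors or collision risk lower it.

    Assumed form: negative weighted sum of absolute tracking errors
    plus a weighted collision-probability penalty.
    """
    tracking_cost = (
        w.lateral * abs(lateral_err)
        + w.longitudinal * abs(longitudinal_err)
        + w.heading * abs(heading_err)
        + w.velocity * abs(velocity_err)
    )
    return -(tracking_cost + w.collision * collision_prob)
```

Under these assumptions, a call such as `driving_reward(0.2, 1.0, 0.05, 0.5, 0.0)` returns a small negative value, while any nonzero collision probability dominates the result because of its larger weight. A real implementation would tune the weights and might use a nonlinear shaping of each term.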