Stochastic Navigation Command Matching for Imitation Learning of a Driving Policy.

Xiangning Meng, Jianru Xue, Kang Zhao, Gengxin Li, Mengsen Wu
DOI: https://doi.org/10.1007/978-3-031-18913-5_15
2022-01-01
Abstract: Conditional imitation learning provides an efficient framework for autonomous driving: a driving policy is learned from human demonstrations as a mapping from sensor data to vehicle controls, and a navigation command is added as input to make the policy controllable. Navigation command matching is the key to ensuring the controllability of the driving policy model. However, the vehicle control parameters output by the model may not coincide with the navigation command, which means that the model behaves incorrectly. To address this mismatch problem, we propose a stochastic navigation command matching (SNCM) method. Firstly, we use a multi-branch convolutional neural network to predict actions. Secondly, we design a memory mechanism to generate the probability distributions of actions that are used in SNCM. The generated probability distributions are then compared with the prior probability distributions under each navigation command to obtain a matching error. Finally, a loss function that weights the matching error and the demonstration error is backpropagated to optimize the driving policy model. Experiments on the CARLA benchmark verify that the proposed method significantly improves performance compared with related work.
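The loss described in the abstract can be illustrated with a minimal arithmetic sketch. The abstract does not specify the command set, the action representation, the distance between distributions, or the weights, so everything below is an assumption made for illustration: a single steering value in [-1, 1], a fixed-size per-command memory of recent predictions, a histogram as the generated distribution, KL divergence as the matching error, and the hypothetical weights `w_demo` and `w_match`. It is not the authors' implementation, and as written it is not differentiable; a real training setup would need a differentiable estimate of the distribution.

```python
import math
from collections import deque

# Hypothetical command set; the abstract does not list the commands.
COMMANDS = ("follow", "left", "right", "straight")


def histogram(values, n_bins=5, lo=-1.0, hi=1.0, eps=1e-6):
    """Empirical distribution of values over equal-width bins.

    eps smooths empty bins so the KL divergence stays finite.
    """
    counts = [eps] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(n_bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


class ActionMemory:
    """Keeps the most recent predicted steering values per command (assumed design)."""

    def __init__(self, capacity=100):
        self.buffers = {c: deque(maxlen=capacity) for c in COMMANDS}

    def add(self, command, steer):
        self.buffers[command].append(steer)

    def distribution(self, command):
        return histogram(self.buffers[command])


def sncm_loss(pred, demo, command, memory, priors, w_demo=1.0, w_match=0.5):
    """Weighted sum of demonstration error and matching error (illustrative)."""
    demo_error = (pred - demo) ** 2            # imitation term
    memory.add(command, pred)                  # update the memory with the new action
    generated = memory.distribution(command)   # distribution of remembered actions
    matching_error = kl_divergence(generated, priors[command])
    return w_demo * demo_error + w_match * matching_error


# Assumed priors: "left" concentrates on negative steering, others are uniform.
priors = {c: [0.2] * 5 for c in COMMANDS}
priors["left"] = histogram([-0.9] * 50)

loss_matched = sncm_loss(-0.9, -0.9, "left", ActionMemory(), priors)
loss_mismatched = sncm_loss(0.9, -0.9, "left", ActionMemory(), priors)
print(loss_matched < loss_mismatched)
```

In this sketch a prediction that steers right under a "left" command is penalized twice: once for deviating from the demonstration and once because the remembered actions drift away from the prior for that command.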