Navigation Command Matching for Vision-based Autonomous Driving

Yuxin Pan, Jianru Xue, Pengfei Zhang, Wanli Ouyang, Jianwu Fang, Xingyu Chen
DOI: https://doi.org/10.1109/icra40945.2020.9196609
2020-01-01
Abstract: Learning an optimal policy for autonomous driving in complex environments is a long-studied challenge. Imitative reinforcement learning is accepted as a promising approach for learning a robust driving policy through expert demonstrations and interactions with the environment. However, this model uses non-smooth rewards, which harm the matching between navigation commands and trajectories (state-action pairs) and degrade the generalizability of the agent. Smooth rewards are crucial for discriminating actions generated by a sub-optimal policy. In this paper, we propose a navigation command matching (NCM) model to address this issue. NCM has two key components: 1) a matching measurer that produces smooth navigation rewards measuring how well a trajectory matches the navigation commands; 2) an attention-guided agent that performs actions given states in which salient regions of the RGB images (i.e., roadsides, lane markings and dynamic obstacles) are highlighted to amplify their influence on the final model. We obtain navigation rewards and store transitions in the replay buffer after an episode, so NCM is able to discriminate actions generated by a sub-optimal policy. Experiments on the CARLA driving benchmark show that the proposed NCM outperforms previous state-of-the-art models on various tasks in terms of the percentage of successfully completed episodes. Moreover, our model improves the generalizability of the agent and performs well even in unseen scenarios.
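The abstract's idea of scoring a trajectory and writing transitions to the replay buffer only after an episode ends can be illustrated with a minimal sketch. This is not the authors' implementation: the class `EpisodeDeferredBuffer`, the function `toy_measurer`, and its reward formula are hypothetical stand-ins, and the action is assumed here to be a single steering value in [-1, 1].

```python
from collections import deque


class EpisodeDeferredBuffer:
    """Hypothetical replay buffer that assigns rewards after the episode ends."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # finished transitions
        self._pending = []                    # transitions still awaiting a reward

    def record(self, state, action, next_state, done):
        # No reward yet: it depends on the whole trajectory.
        self._pending.append((state, action, next_state, done))

    def finish_episode(self, command, measurer):
        # Score the full trajectory of (state, action) pairs against the command.
        trajectory = [(s, a) for s, a, _, _ in self._pending]
        rewards = measurer(command, trajectory)
        for (s, a, s2, d), r in zip(self._pending, rewards):
            self.buffer.append((s, a, r, s2, d))
        self._pending.clear()
        return len(rewards)


def toy_measurer(command, trajectory):
    # Assumed stand-in for the matching measurer: a smooth reward in [0, 1]
    # that decays linearly with the distance between the steering action
    # and a nominal steering target for the command.
    target = {"left": -1.0, "straight": 0.0, "right": 1.0}[command]
    return [1.0 - abs(steer - target) / 2.0 for _, steer in trajectory]


buf = EpisodeDeferredBuffer()
buf.record("s0", -1.0, "s1", False)
buf.record("s1", 0.0, "s2", True)
stored = buf.finish_episode("left", toy_measurer)
```

Because the toy reward varies continuously with the action, a partly matching action (steering 0.0 under a "left" command) receives an intermediate reward instead of an all-or-nothing signal, which is the kind of discrimination between sub-optimal actions that the abstract attributes to smooth rewards.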