Hierarchical Interpretable Imitation Learning for End-to-End Autonomous Driving
Siyu Teng,Long Chen,Yunfeng Ai,Yuanye Zhou,Zhe Xuanyuan,Xuemin Hu
DOI: https://doi.org/10.1109/tiv.2022.3225340
IF: 8.2
2023-01-01
IEEE Transactions on Intelligent Vehicles
Abstract:End-to-end autonomous driving provides a simple and efficient framework for autonomous driving systems, which can directly obtain control commands from raw perception data. However, it fails to address stability and interpretability problems in complex urban scenarios. In this paper, we construct a two-stage end-to-end autonomous driving model for complex urban scenarios, named HIIL (Hierarchical Interpretable Imitation Learning), which integrates interpretable BEV mask and steering angle to solve the problems shown above. In Stage One, we propose a pretrained Bird's Eye View (BEV) model which leverages a BEV mask to present an interpretation of the surrounding environment. In Stage Two, we construct an Interpretable Imitation Learning (IIL) model that fuses BEV latent feature from Stage One with an additional steering angle from Pure-Pursuit (PP) algorithm. In the HIIL model, visual information is converted to semantic images by the semantic segmentation network, and the semantic images are encoded to extract the BEV latent feature, which are decoded to predict BEV masks and fed to the IIL as perception data. In this way, the BEV latent feature bridges the BEV and IIL models. Visual information can be supplemented by the calculated steering angle for PP algorithm, speed vector, and location information, thus it could have better performance in complex and terrible scenarios. Our HIIL model meets an urgent requirement for interpretability and robustness of autonomous driving. We validate the proposed model in the CARLA simulator with extensive experiments which show remarkable interpretability, generalization, and robustness capability in unknown scenarios for navigation tasks.
English Else