Multi-modal policy fusion for end-to-end autonomous driving

Zhenbo Huang, Shiliang Sun, Jing Zhao, Liang Mao
DOI: https://doi.org/10.1016/j.inffus.2023.101834
IF: 18.6
2023-05-20
Information Fusion
Abstract: Multi-modal learning has made impressive progress in autonomous driving by leveraging information from multiple sensors. Existing feature fusion methods make decisions by integrating perceptions from different sensors. However, such autonomous driving systems can be risky because the fused features become unreliable when one of the sensors fails. Moreover, these methods require either sophisticated geometric designs to align features or complex neural networks to fuse features effectively, which significantly increases the training cost. In this paper, we propose PolicyFuser, a policy fusion method for end-to-end autonomous driving that addresses these issues. PolicyFuser retains an independent decision for each sensor and requires neither feature alignment nor complex neural networks. To focus on the best policy, we use reinforcement learning to select the action with the highest Q-value as the primary decision and treat the remaining actions as secondary decisions. The secondary decisions are then used to fine-tune the primary decision through a primary and secondary policy fusion (PSF) module. To bridge the gap between the decisions from different sensors and improve the stability of policy fusion, we use a conditional variational autoencoder (CVAE) to generate pseudo-expert decisions. We demonstrate the effectiveness of our method in CARLA, where it achieves the highest driving scores and handles sensor failures well.
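The primary/secondary selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the action format (steer, throttle), the sensor names, and the functions `split_primary_secondary` and `fuse_stub` are all hypothetical. In particular, `fuse_stub` is only a hand-written placeholder for the PSF module, which in the paper is a learned component whose details are not given in the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical action format: (steer, throttle). The abstract does not specify it.
Action = Tuple[float, float]


def split_primary_secondary(
    actions: Dict[str, Action], q_values: Dict[str, float]
) -> Tuple[str, Action, List[Action]]:
    """Pick the per-sensor action with the highest Q-value as the primary
    decision; every other sensor's action becomes a secondary decision."""
    primary_sensor = max(q_values, key=q_values.get)
    primary = actions[primary_sensor]
    secondary = [a for name, a in actions.items() if name != primary_sensor]
    return primary_sensor, primary, secondary


def fuse_stub(primary: Action, secondary: List[Action], weight: float = 0.1) -> Action:
    """Assumed stand-in for PSF: nudge the primary action toward the mean of
    the secondary actions. The real PSF module is learned, not a fixed rule."""
    if not secondary:
        return primary
    n = len(secondary)
    mean = [sum(a[i] for a in secondary) / n for i in range(len(primary))]
    return tuple(p + weight * (m - p) for p, m in zip(primary, mean))


if __name__ == "__main__":
    # Hypothetical per-sensor decisions and their Q-values.
    actions = {"camera": (0.1, 0.5), "lidar": (0.3, 0.4)}
    q_values = {"camera": 1.0, "lidar": 2.0}
    sensor, primary, secondary = split_primary_secondary(actions, q_values)
    print(sensor, primary, secondary)
    print(fuse_stub(primary, secondary, weight=0.5))
```

Because each sensor keeps its own decision, a failed sensor only affects one candidate action; in this sketch that would correspond to dropping its entry from both dictionaries before the selection step.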
computer science, artificial intelligence, theory & methods