Vision Language Models in Autonomous Driving and Intelligent Transportation Systems

Xingcheng Zhou, Mingyu Liu, Bare Luka Zagar, Ekim Yurtsever, Alois C. Knoll
DOI: https://doi.org/10.48550/arXiv.2310.14414
2023-10-22
Computer Vision and Pattern Recognition
Abstract: The applications of Vision-Language Models (VLMs) in Autonomous Driving (AD) and Intelligent Transportation Systems (ITS) have attracted widespread attention due to their outstanding performance and their ability to leverage Large Language Models (LLMs). By integrating language data, vehicles and transportation systems can understand real-world environments more deeply, improving driving safety and efficiency. In this work, we present a comprehensive survey of the advances in language models in this domain, encompassing current models and datasets. Additionally, we explore potential applications and emerging research directions. Finally, we thoroughly discuss the challenges and research gaps. The paper aims to provide researchers with an overview of current work and future trends of VLMs in AD and ITS.
What problem does this paper attempt to address?