Abstract: In recent years, reinforcement learning (RL)-based methods for learning driving policies have gained increasing attention in the autonomous driving community and have achieved remarkable progress in various driving scenarios. However, traditional RL approaches rely on manually engineered rewards, which require extensive human effort and often lack generalizability. To address these limitations, we propose VLM-RL, a unified framework that integrates pre-trained Vision-Language Models (VLMs) with RL to generate reward signals from image observations and natural language goals. The core of VLM-RL is the contrasting language goal (CLG)-as-reward paradigm, which uses positive and negative language goals to generate semantic rewards. We further introduce a hierarchical reward synthesis approach that combines CLG-based semantic rewards with vehicle state information, improving reward stability and offering a more comprehensive reward signal. Additionally, a batch-processing technique is employed to optimize computational efficiency during training. Extensive experiments in the CARLA simulator demonstrate that VLM-RL outperforms state-of-the-art baselines, achieving a 10.5% reduction in collision rate, a 104.6% increase in route completion rate, and robust generalization to unseen driving scenarios. Furthermore, VLM-RL can be integrated with almost any standard RL algorithm, potentially revolutionizing the existing RL paradigm that relies on manual reward engineering and enabling continuous performance improvements. The demo video and code can be accessed at: https://zilin-huang.github.io/VLM-RL-website
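
To make the CLG-as-reward idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the reward formula, so the sketch assumes the semantic reward is the cosine similarity between a frame embedding and the positive goal embedding minus its similarity to the negative goal embedding, and that hierarchical synthesis scales this by a speed-tracking term. The embeddings are plain lists standing in for VLM outputs, and every name (`cosine`, `clg_rewards`, `synthesize`, `target_speed`) is hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def clg_rewards(frames: List[Vector], pos_goal: Vector, neg_goal: Vector) -> List[float]:
    """Assumed CLG reward: similarity to the positive goal minus similarity
    to the negative goal, computed for a whole batch of frame embeddings."""
    return [cosine(f, pos_goal) - cosine(f, neg_goal) for f in frames]


def synthesize(semantic: float, speed: float, target_speed: float) -> float:
    """Assumed hierarchical synthesis: rescale the semantic reward from
    [-2, 2] to [0, 1], then scale it by how closely speed tracks the target."""
    semantic01 = min(max((semantic + 2.0) / 4.0, 0.0), 1.0)
    speed_term = max(0.0, 1.0 - abs(speed - target_speed) / target_speed)
    return semantic01 * speed_term


if __name__ == "__main__":
    # Toy 2-D embeddings; a real system would obtain these from a VLM encoder.
    pos_goal = [1.0, 0.0]   # stands in for a positive goal description
    neg_goal = [0.0, 1.0]   # stands in for a negative goal description
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for r in clg_rewards(frames, pos_goal, neg_goal):
        print(round(r, 3), round(synthesize(r, speed=4.0, target_speed=5.0), 3))
```

Scoring a stored batch of frames in a single call mirrors the batch-processing idea in spirit: the expensive VLM encoding would be done once per batch rather than at every environment step.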

Human-centric Reward Optimization for Reinforcement Learning-based Automated Driving using Large Language Models

Generating and Evolving Reward Functions for Highway Driving with Large Language Models

LORD: Large Models based Opposite Reward Design for Autonomous Driving

Drive Like a Human: Rethinking Autonomous Driving with Large Language Models

VLM-RL: A Unified Vision Language Models and Reinforcement Learning Framework for Safe Autonomous Driving

Receive, Reason, and React: Drive as You Say, With Large Language Models in Autonomous Vehicles

Optimizing Autonomous Driving for Safety: A Human-Centric Approach with LLM-Enhanced RLHF

Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving

Towards Socially and Morally Aware RL agent: Reward Design With LLM

Prompting Multi-Modal Tokens to Enhance End-to-End Autonomous Driving Imitation Learning with LLMs

A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning

HGRL: Human-Driving-Data Guided Reinforcement Learning for Autonomous Driving

Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics

LLM4RL: Enhancing Reinforcement Learning with Large Language Models

REvolve: Reward Evolution with Large Language Models using Human Feedback

Personalized Autonomous Driving with Large Language Models: Field Experiments

LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Empowering Autonomous Driving with Large Language Models: A Safety Perspective

Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles