Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge

Pin Wang, Ching-Yao Chan
DOI: https://doi.org/10.1109/ITSC.2017.8317735
2019-02-02
Abstract: Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways only, that is, the transition from manual to automated modes takes place only after the ramp merging process is completed manually. One major challenge to extend the automation to ramp merging is that the automated vehicle needs to incorporate and optimize long-term objectives (e.g. successful and smooth merge) when near-term actions must be safely executed. Moreover, the merging process involves interactions with other vehicles whose behaviors are sometimes hard to predict but may influence the merging vehicle's optimal actions. To tackle such a complicated control problem, we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an optimal driving policy by maximizing the long-term reward in an interactive environment. Specifically, we apply a Long Short-Term Memory (LSTM) architecture to model the interactive environment, from which an internal state containing historical driving information is conveyed to a Deep Q-Network (DQN). The DQN is used to approximate the Q-function, which takes the internal state as input and generates Q-values as output for action selection. With this DRL architecture, the historical impact of the interactive environment on the long-term reward can be captured and taken into account for deciding the optimal control policy. The proposed architecture has the potential to be extended and applied to other autonomous driving scenarios such as driving through a complex intersection or changing lanes under varying traffic flow conditions.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the challenges an autonomous vehicle faces when merging at highway on-ramps. It focuses on two issues:

1. **Coordination of long-term goals and short-term actions**: When the autonomous driving system executes an immediate action, it must account for that action's delayed effect on the vehicle's future state. For example, accelerating, decelerating, or steering at the current moment affects whether the vehicle can complete the merge successfully and smoothly.
2. **Interaction with other vehicles**: The merging process depends not only on the state and actions of the autonomous vehicle itself but also on its interactions with surrounding vehicles. These interactions may be cooperative (such as decelerating or changing lanes to let the merging vehicle enter the mainline traffic smoothly) or adversarial (such as accelerating to keep the merging vehicle from entering the mainline). This complexity makes a robust merging strategy harder to implement.

To address these challenges, the authors propose using Deep Reinforcement Learning (DRL) to find an optimal driving policy by maximizing the long-term reward. Specifically, a Long Short-Term Memory (LSTM) architecture models the interactive environment and produces an internal state containing historical driving information; this internal state is fed to a Deep Q-Network (DQN), which outputs Q-values for action selection. In this way, the influence of the historical driving environment on the long-term reward is captured and taken into account when determining the control policy. The authors note that the architecture could be extended to other autonomous driving scenarios, such as driving through complex intersections or changing lanes under varying traffic flow conditions.
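The LSTM-to-DQN data flow described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `LSTMQNetwork`, the six-dimensional observation, the three-way discrete action set, the epsilon-greedy rule, and the one-step temporal-difference target are all assumptions chosen for illustration, and the weights are random rather than trained. The sketch only shows how a recurrent internal state that summarizes past observations can be mapped to one Q-value per action.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class LSTMQNetwork:
    """One LSTM cell followed by a linear Q-value head (untrained, random weights)."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_dim = hidden_dim
        # Stacked weights for the input, forget, candidate, and output gates.
        self.W = rng.normal(0.0, 0.1, size=(4 * hidden_dim, obs_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        # Linear head mapping the internal state to one Q-value per action.
        self.W_q = rng.normal(0.0, 0.1, size=(n_actions, hidden_dim))
        self.b_q = np.zeros(n_actions)

    def initial_state(self):
        # Hidden state h and cell state c both start at zero.
        return np.zeros(self.hidden_dim), np.zeros(self.hidden_dim)

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated (h, c)."""
        h, c = state
        z = self.W @ np.concatenate([obs, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # blend history with new input
        h = sigmoid(o) * np.tanh(c)                   # internal state passed to the Q head
        q_values = self.W_q @ h + self.b_q
        return q_values, (h, c)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy selection (an assumed exploration rule)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_values))


# Hypothetical discretization of the merging vehicle's longitudinal control.
ACTIONS = ["decelerate", "hold", "accelerate"]

net = LSTMQNetwork(obs_dim=6, hidden_dim=16, n_actions=len(ACTIONS))
rng = np.random.default_rng(1)
state = net.initial_state()
for _ in range(5):
    obs = rng.normal(size=6)  # placeholder for ego and neighbor vehicle features
    q_values, state = net.step(obs, state)
action = ACTIONS[select_action(q_values, epsilon=0.0, rng=rng)]
```

Because `step` carries `(h, c)` forward, the Q-values at each time step depend on the whole observation history rather than on the latest observation alone, which is the property the summary attributes to the LSTM component. Training (for example, regressing the Q-value of the chosen action toward `td_target`) is omitted here.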