Formulation and validation of a car-following model based on deep reinforcement learning

Fabian Hart, Ostap Okhrin, Martin Treiber
DOI: https://doi.org/10.48550/arXiv.2109.14268
2021-09-29
Abstract: We propose and validate a novel car-following model based on deep reinforcement learning. Our model is trained to maximize externally given reward functions for the free and car-following regimes rather than reproducing existing follower trajectories. The parameters of these reward functions such as desired speed, time gap, or accelerations resemble those of traditional models such as the Intelligent Driver Model (IDM) and allow for explicitly implementing different driving styles. Moreover, they partially lift the black-box nature of conventional neural network models. The model is trained on leading speed profiles governed by a truncated Ornstein-Uhlenbeck process reflecting a realistic leader's kinematics. This allows for arbitrary driving situations and an infinite supply of training data. For various parameterizations of the reward functions, and for a wide variety of artificial and real leader data, the model turned out to be unconditionally string stable, comfortable, and crash-free. String stability has been tested with a platoon of five followers following an artificial and a real leading trajectory. A cross-comparison with the IDM calibrated to the goodness-of-fit of the relative gaps showed a higher reward compared to the traditional model and a better goodness-of-fit.
Machine Learning, Robotics, Signal Processing
What problem does this paper attempt to address?
This paper develops and validates a new car-following model based on deep reinforcement learning. The model targets several limitations of existing car-following models:

1. **Generality and Adaptability**: Traditional models often perform well only in specific scenarios. The goal here is a single model that handles a wide range of driving situations, covering both the free-driving and the car-following regimes.
2. **Safety and Comfort**: Existing car-following models may handle safety-critical situations such as emergency braking poorly, or may drive uncomfortably. The proposed model is intended to be crash-free while keeping the driving experience comfortable.
3. **Parameter Adjustment and Driving Style**: The reward functions are parameterized by quantities such as desired speed, time gap, and accelerations, which resemble the parameters of the traditional Intelligent Driver Model (IDM). Adjusting them produces different driving styles (for example, aggressive or conservative) and partially lifts the black-box nature of conventional neural network models.
4. **Data Generation and Generalization Ability**: The model is trained on leading speed profiles governed by a truncated Ornstein-Uhlenbeck process, which reflects a realistic leader's kinematics. This permits arbitrary driving situations and an unlimited supply of training data, which is intended to improve the model's generalization.
5. **String Stability**: A platoon of vehicles using the model should not amplify traffic fluctuations as they propagate from follower to follower. In tests with a platoon of five followers behind an artificial and a real leading trajectory, the model turned out to be string stable.

In summary, the paper's main objective is a car-following model based on deep reinforcement learning that performs well across multiple driving scenarios while remaining safe, comfortable, and string stable, with good generalization ability.
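The training-data idea described above (leader speeds drawn from a truncated Ornstein-Uhlenbeck process) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_leader_speed`, all parameter values, and the choice to truncate by clipping the speed to `[v_min, v_max]` are assumptions made here for illustration only.

```python
import math
import random


def simulate_leader_speed(n_steps, dt=0.1, v_mean=15.0, sigma=5.0, tau=30.0,
                          v_min=0.0, v_max=30.0, seed=None):
    """Return a list of leader speeds (m/s) from a clipped OU process.

    All defaults are hypothetical values, not taken from the paper.
    """
    rng = random.Random(seed)
    # Exact discrete-time update of an Ornstein-Uhlenbeck process with
    # correlation time tau and stationary standard deviation sigma.
    decay = math.exp(-dt / tau)
    noise_std = sigma * math.sqrt(1.0 - decay ** 2)

    v = v_mean
    speeds = []
    for _ in range(n_steps):
        # Mean-reverting step plus Gaussian noise.
        v = v_mean + (v - v_mean) * decay + noise_std * rng.gauss(0.0, 1.0)
        # Truncation, assumed here to be simple clipping to the allowed range.
        v = min(max(v, v_min), v_max)
        speeds.append(v)
    return speeds


# Example: 60 s of leader speeds at 10 Hz, reproducible via the seed.
profile = simulate_leader_speed(600, seed=42)
```

Because each call with a new seed yields a fresh speed profile, a generator of this kind can supply training episodes without relying on a fixed set of recorded trajectories, which is the property the summary highlights.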