Continuous Improvement of Self-Driving Cars Using Dynamic Confidence-Aware Reinforcement Learning

Zhong Cao, Kun Jiang, Weitao Zhou, Shaobing Xu, Huei Peng, Diange Yang
DOI: https://doi.org/10.1038/s42256-023-00610-y
IF: 23.8
2023-01-01
Nature Machine Intelligence
Abstract: Today’s self-driving vehicles have achieved impressive driving capabilities, but still suffer from uncertain performance in long-tail cases. Training a reinforcement-learning-based self-driving algorithm with more data does not always lead to better performance, which is a safety concern. Here we present a dynamic confidence-aware reinforcement learning (DCARL) technology for guaranteed continuous improvement. Continuously improving means that more training always improves or maintains its current performance. Our technique enables performance improvement using the data collected during driving, and does not need a lengthy pre-training phase. We evaluate the proposed technology both using simulations and on an experimental vehicle. The results show that the proposed DCARL method enables continuous improvement in various cases, and, in the meantime, matches or outperforms the default self-driving policy at any stage. This technology was demonstrated and evaluated on the vehicle at the 2022 Beijing Winter Olympic Games.

Reinforcement learning is a powerful technique to learn complex behaviours, but in the context of self-driving vehicles it might result in unsafe behaviour in previously unseen situations. Cao et al. create a confidence-aware method that improves through reinforcement learning but reverts to safe behaviour when a situation is new.
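The idea of reverting to safe behaviour when a situation is new can be illustrated with a small confidence gate. The sketch below is not the authors' DCARL algorithm; it only assumes that the learned policy is used when a lower confidence bound on its observed returns is at least the known value of the default policy. The function names, the normal-approximation bound, and the threshold `z=1.96` are all hypothetical choices made for illustration.

```python
import math
import statistics


def lower_confidence_bound(returns, z=1.96):
    """Normal-approximation lower bound on the mean of observed returns.

    Hypothetical helper: with fewer than two samples there is no basis
    for confidence, so the bound is negative infinity.
    """
    n = len(returns)
    if n < 2:
        return -math.inf
    mean = statistics.fmean(returns)
    std_err = statistics.stdev(returns) / math.sqrt(n)
    return mean - z * std_err


def select_policy(rl_returns, default_value, z=1.96):
    """Pick the learned policy only when it is confidently no worse.

    rl_returns: returns observed for the learned policy in this situation
    default_value: assumed known value of the default (safe) policy
    """
    if lower_confidence_bound(rl_returns, z) >= default_value:
        return "rl"
    return "default"


if __name__ == "__main__":
    # Unseen situation: no data, so the gate falls back to the default.
    print(select_policy([], default_value=0.5))
    # Well-sampled situation with consistently high returns.
    print(select_policy([1.0, 1.1, 0.9, 1.0, 1.05], default_value=0.5))
```

Under these assumptions, collecting more data can only tighten the bound for a given situation, so the gate switches to the learned policy once the evidence supports it and otherwise keeps the default behaviour.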