CARNet: A Dynamic Autoencoder for Learning Latent Dynamics in Autonomous Driving Tasks

Andrey Pak, Hemanth Manjunatha, Dimitar Filev, Panagiotis Tsiotras
DOI: https://doi.org/10.48550/arXiv.2205.08712
2022-05-27
Abstract: Autonomous driving has received a lot of attention in the automotive industry and is often seen as the future of transportation. Passenger vehicles equipped with a wide array of sensors (e.g., cameras, front-facing radars, LiDARs, and IMUs) capable of continuous perception of the environment are becoming increasingly prevalent. These sensors provide a stream of high-dimensional, temporally correlated data that is essential for reliable autonomous driving. An autonomous driving system should effectively use the information collected from the various sensors in order to form an abstract description of the world and maintain situational awareness. Deep learning models, such as autoencoders, can be used for that purpose, as they can learn compact latent representations from a stream of incoming data. However, most autoencoder models process the data independently, without assuming any temporal interdependencies. Thus, there is a need for deep learning models that explicitly consider the temporal dependence of the data in their architecture. This work proposes CARNet, a Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder combined with a recurrent neural network to learn the current latent representation and, in addition, also predict future latent representations in the context of autonomous driving. We demonstrate the efficacy of the proposed model in both imitation and reinforcement learning settings using both simulated and real datasets. Our results show that the proposed model outperforms the baseline state-of-the-art model, while having significantly fewer trainable parameters.
Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses how to learn both current and future latent representations from high-dimensional, temporally correlated sensor data in autonomous driving tasks. It points out that most existing autoencoder models process each input independently and ignore temporal dependence, which the paper argues costs performance, especially on data generated by dynamical systems. To address this, the paper proposes CARNet (Combined dynAmic autoencodeR NETwork), which combines an autoencoder with a recurrent neural network (RNN) so that temporal dependence is modeled explicitly: the model learns the current latent representation and also predicts future latent representations. The paper also examines how an attention mechanism affects the estimation of latent vectors from images, with the aim of further improving performance. CARNet is evaluated in imitation learning and reinforcement learning settings on simulated and real-world datasets, where, according to the abstract, it outperforms the baseline state-of-the-art model while having significantly fewer trainable parameters.
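The general idea of pairing an encoder with a recurrent state can be illustrated with a toy forward pass. The sketch below is not CARNet's architecture: the linear layers, the plain tanh recurrence, the dimensions, and every name (`DynamicAutoencoderSketch`, `predict_next`, and so on) are assumptions chosen for illustration, and the weights are random rather than trained. It only shows how each frame is encoded, how the recurrent state accumulates the encoded history, and how that state is decoded into a prediction of the next frame.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix; stands in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class DynamicAutoencoderSketch:
    """Toy model: encode each frame, update a recurrent state, decode the next frame."""

    def __init__(self, frame_dim, latent_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.W_enc = rand_matrix(latent_dim, frame_dim, rng)   # frame -> latent
        self.W_in = rand_matrix(hidden_dim, latent_dim, rng)   # latent -> state
        self.W_rec = rand_matrix(hidden_dim, hidden_dim, rng)  # state -> state
        self.W_dec = rand_matrix(frame_dim, hidden_dim, rng)   # state -> frame

    def encode(self, frame):
        """Per-frame latent vector (no temporal information yet)."""
        return [math.tanh(v) for v in matvec(self.W_enc, frame)]

    def step(self, hidden, latent):
        """Plain tanh recurrence mixing the new latent with the previous state."""
        a = matvec(self.W_in, latent)
        b = matvec(self.W_rec, hidden)
        return [math.tanh(x + y) for x, y in zip(a, b)]

    def decode(self, hidden):
        """Map the recurrent state back to frame space."""
        return matvec(self.W_dec, hidden)

    def predict_next(self, frames):
        """Run over a frame sequence; return the final state and predicted next frame."""
        hidden = [0.0] * self.hidden_dim
        for frame in frames:
            hidden = self.step(hidden, self.encode(frame))
        return hidden, self.decode(hidden)


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical data: five random 8-dimensional "frames".
rng = random.Random(1)
frames = [[rng.random() for _ in range(8)] for _ in range(5)]

model = DynamicAutoencoderSketch(frame_dim=8, latent_dim=4, hidden_dim=3)
hidden, predicted = model.predict_next(frames[:-1])

# Training would minimise this next-frame prediction error.
loss = mse(predicted, frames[-1])
```

In a trained model of this kind, the recurrent state would serve as the compact representation handed to a downstream imitation or reinforcement learning policy; here it is computed from untrained weights purely to show the data flow.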