Imitation Learning for Vision-based Lane Keeping Assistance

Christopher Innocenti, Henrik Lindén, Ghazaleh Panahandeh, Lennart Svensson, Nasser Mohammadiha
DOI: https://doi.org/10.48550/arXiv.1709.03853
2017-09-12
Abstract: This paper aims to investigate direct imitation learning from human drivers for the task of lane keeping assistance in highway and country roads using grayscale images from a single front view camera. The employed method utilizes convolutional neural networks (CNN) to act as a policy that is driving a vehicle. The policy is successfully learned via imitation learning using real-world data collected from human drivers and is evaluated in closed-loop simulated environments, demonstrating good driving behaviour and a robustness for domain changes. Evaluation is based on two proposed performance metrics measuring how well the vehicle is positioned in a lane and the smoothness of the driven trajectory.
Machine Learning
What problem does this paper attempt to address?
The paper sets out to build a lane-keeping assist system using vision-based imitation learning. Specifically, the authors want a vehicle to keep its lane on highways and rural roads by learning directly from the behavior of human drivers, using only grayscale images from a single front-facing camera.

### Main Problems

1. **Lane-keeping Assist**: How can an autonomous vehicle be made to stay safely and stably within its lane by imitating the behavior of human drivers?
2. **Data-driven Approach**: Use a convolutional neural network (CNN) as the policy model and learn directly from real-world human driving data, without complex sensor combinations or hand-designed feature extraction.
3. **Model Evaluation**: Verify the robustness and performance of the learned model in different environments, in particular in a simulated environment.

### Solutions

- **Imitation Learning**: Adopt direct imitation learning (DIL), that is, learn from the state-action pairs of human drivers and train a CNN model to predict the steering angle.
- **Data Processing**: Train the model on a large amount of driving log data provided by Volvo Cars, after pre-processing and cropping.
- **Model Architecture**: Design a CNN model with five convolutional layers and four fully connected layers. The input is the grayscale image from the front-facing camera, and the output is the curvature of the vehicle (i.e., the steering angle).
- **Evaluation Metrics**: Propose two performance metrics, lane position error and driving smoothness, to quantitatively evaluate the performance of the model.

### Key Formulas

1. **Loss Function**:
\[ l(a, \pi_\theta(s)) = (a - \pi_\theta(s))^2 \]
where \(a\) is the action of the expert (human driver), \(\pi_\theta(s)\) is the action predicted by the model, and \(s\) is the state (i.e., the image).

2. **Expected Loss**:
\[ E_{(s,a)\sim d^{\pi^*}}[l(a, \pi_\theta(s))] \approx \frac{1}{N}\sum_{i=1}^{N}(a_i - \pi_\theta(s_i))^2 \]
where \(N\) is the number of state-action pairs in the dataset.

3. **Lane Position Error**:
\[ e_\beta(d) = \begin{cases} 1 & \text{if } d < 0 \\ \left(\frac{\beta w}{d}\right)^w - \beta d & \text{if } 0 \leq d \leq w \\ 0 & \text{if } d > w \end{cases} \]

4. **Discomfort Level**:
\[ e_g(x) = \begin{cases} \frac{x^2}{g^2} & \text{if } x < g \\ \frac{5}{6} + \frac{x^2}{6g^2} & \text{if } x \geq g \end{cases} \]
where \(x\) is the measured lateral acceleration or jitter, and \(g\) is the comfort threshold.

With these methods and metrics, the research demonstrates that a lane-keeping assist system based on a single front-facing camera and imitation learning is feasible and effective.
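
To make the loss and discomfort formulas above concrete, here is a minimal Python sketch. It is not the authors' code: the function names (`squared_loss`, `empirical_loss`, `discomfort`) are hypothetical, the policy is any callable standing in for the CNN, and states are plain floats rather than camera images. The discomfort function follows the piecewise definition exactly as stated, under the assumption that \(x\) is a non-negative magnitude and \(g > 0\).

```python
from typing import Callable, Sequence


def squared_loss(expert_action: float, predicted_action: float) -> float:
    """Per-sample loss l(a, pi_theta(s)) = (a - pi_theta(s))^2."""
    return (expert_action - predicted_action) ** 2


def empirical_loss(
    policy: Callable[[float], float],
    states: Sequence[float],
    actions: Sequence[float],
) -> float:
    """Average squared loss over N state-action pairs (the 1/N sum)."""
    if len(states) != len(actions) or len(states) == 0:
        raise ValueError("states and actions must be non-empty and equal in length")
    total = sum(squared_loss(a, policy(s)) for s, a in zip(states, actions))
    return total / len(states)


def discomfort(x: float, g: float) -> float:
    """Piecewise e_g(x): x^2/g^2 below g, 5/6 + x^2/(6 g^2) at or above g.

    Assumes x is a non-negative magnitude and g > 0 (not stated in the summary).
    """
    if g <= 0:
        raise ValueError("comfort threshold g must be positive")
    ratio_sq = (x / g) ** 2
    if x < g:
        return ratio_sq
    return 5.0 / 6.0 + ratio_sq / 6.0


if __name__ == "__main__":
    # Toy stand-in for the CNN policy: a hypothetical linear map on scalar states.
    toy_policy = lambda s: 0.5 * s
    print(empirical_loss(toy_policy, [0.0, 1.0, 2.0], [0.0, 1.0, 1.0]))
    print(discomfort(1.0, 2.0), discomfort(2.0, 2.0), discomfort(4.0, 2.0))
```

The two branches of `discomfort` meet at \(x = g\), where both evaluate to 1, so the penalty is continuous; beyond the threshold it grows at one sixth of the quadratic rate.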