Racing Towards Reinforcement Learning based control of an Autonomous Formula SAE Car

Aakaash Salvaji, Harry Taylor, David Valencia, Trevor Gee, Henry Williams
2023-08-25
Abstract: With the rising popularity of autonomous navigation research, Formula Student (FS) events are introducing a Driverless Vehicle (DV) category to their event list. This paper presents the initial investigation into utilising Deep Reinforcement Learning (RL) for end-to-end control of an autonomous FS race car for these competitions. We train two state-of-the-art RL algorithms in simulation on tracks analogous to the full-scale design on a Turtlebot2 platform. The results demonstrate that our approach can successfully learn to race in simulation and then transfer to a real-world racetrack on the physical platform. Finally, we provide insights into the limitations of the presented approach and guidance into the future directions for applying RL toward full-scale autonomous FS racing.
Robotics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is end-to-end control of an autonomous Formula Student (FS) race car using Deep Reinforcement Learning (RL). Specifically, it trains two state-of-the-art RL algorithms (TD3 and DQN) in a simulated environment and then tests them on the physical Turtlebot2 platform on a real-world track to verify their applicability and performance. In this way, the research aims to overcome the limitations of traditional path-planning methods and improve the adaptability of robots in dynamic and unpredictable environments, particularly in autonomous racing.

### Background of the Paper and Related Work

- **Reinforcement Learning (RL)**: As a branch of artificial intelligence, RL learns control strategies through direct interaction with the environment and can adapt to new situations without human-designed solutions. This makes it a promising approach to robot control problems.
- **Mobile Robot Control**: Traditional methods have been successful at motion planning and control for mobile robots, but they usually require a great deal of engineering effort before they can be deployed reliably in the real world. Machine-learning-based navigation methods, especially Deep Reinforcement Learning (DRL), have been proposed as a way to reduce this manual engineering effort.
- **Challenges**: The main challenges for DRL methods include performance verification, narrowing the gap between simulation and reality, sample efficiency, designing practical reward functions, and ensuring safety.

### Methods

- **Experimental Setup**: The Turtlebot2 platform, equipped with a RealSense D435 camera, is used for training in simulation and for testing in the real world. To simplify the computer vision problem, the cones that mark the track are replaced with ArUco markers.
- **State Space and Action Space**:
  - **State Space**: Consists of the positions of the six nearest ArUco markers detected by the RealSense D435 camera. Each position is given as the lateral distance (x) and the forward distance (z) of the marker relative to the Turtlebot2.
  - **Action Space**: The DQN model selects from a discrete set of fixed positive and negative rotation speeds (±0.2 rad/s), while the TD3 model outputs a continuous rotation speed (-0.4 rad/s to 0.4 rad/s).
- **Reward Function**: The reward is the cosine of the angle between the robot's current heading and the direction to the midpoint of the nearest pair of markers, which encourages the robot to stay on the center line of the track.

### Results

- **Simulation Training Results**: The TD3 model begins to stabilize after 2000 training cycles, while the average reward of the DQN model is still increasing after 5000 cycles.
- **Simulation Track Segment Tests**: The TD3 model achieves higher success rates and completes more of the track on the various combinations of track segments.
- **Real-World Track Segment Tests**: The TD3 model performs better than the DQN model in the real world, especially when turning left.
- **Oval Track Simulation**: The TD3 model is also more stable on the oval track, but all models struggle to complete the entire track, with most failures occurring in the turns.

### Discussion

- **Rotation Control Jitter**: The models show high jitter when turning, possibly because the reward function depends only on the heading angle and not on the rate of change of the action. Introducing a more complex reward function that also evaluates the smoothness of the rotation control may significantly reduce the jitter.
- **Model Evaluation**: The TD3 model outperforms the DQN model in all tests, and the TD3 model trained without noise performs best.
Adding noise has little impact on performance, and in some cases the no-noise model performs even better.

### Conclusion

This research presents preliminary results on using deep reinforcement learning for end-to-end control of autonomous FS race cars. Further improvement is still needed, especially in reducing rotation control jitter and improving performance on complex tracks. Future research can explore more complex reward functions and more training data to further improve model performance.
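
The cosine reward described under Methods, and the smoothness term suggested in the Discussion, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the choice of a left/right marker pair as the "nearest pair", the absolute-difference penalty, and the `penalty_weight` value are all assumptions made for illustration. Marker positions are taken as (x, z) tuples in the robot frame, with the robot facing along +z.

```python
import math


def midpoint_heading_error(left, right):
    """Angle (rad) between the robot's forward axis and the pair's midpoint.

    `left` and `right` are (x, z) marker positions in the robot frame:
    x is the lateral offset, z is the forward distance.
    """
    mid_x = (left[0] + right[0]) / 2.0
    mid_z = (left[1] + right[1]) / 2.0
    # atan2(lateral, forward) is zero when the midpoint is dead ahead.
    return math.atan2(mid_x, mid_z)


def cosine_reward(left, right):
    """Reward of 1.0 when heading at the midpoint, falling off with the angle."""
    return math.cos(midpoint_heading_error(left, right))


def smoothed_reward(left, right, action, prev_action, penalty_weight=0.5):
    """Hypothetical variant: subtract a penalty for abrupt changes in rotation speed.

    `action` and `prev_action` are rotation speeds in rad/s.
    `penalty_weight` is an assumed tuning constant, not a value from the paper.
    """
    return cosine_reward(left, right) - penalty_weight * abs(action - prev_action)
```

For example, with markers at (-0.5, 1.0) and (0.5, 1.0) the midpoint is straight ahead and `cosine_reward` returns 1.0. If the policy then flips from -0.4 rad/s to 0.4 rad/s, `smoothed_reward` drops to 0.6 under the assumed weight. This is the kind of signal that could discourage the jitter noted in the Discussion.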