Reinforcement Learning-Based Control of CrazyFlie 2.X Quadrotor

Arshad Javeed, Valentín López Jiménez
2023-06-14
Abstract: The objective of the project is to explore synergies between classical control algorithms such as PID and contemporary reinforcement learning algorithms to come up with a pragmatic control mechanism to control the CrazyFlie 2.X quadrotor. The primary objective would be performing PID tuning using reinforcement learning strategies. The secondary objective is to leverage the learnings from the first task to implement control for navigation by integrating with the lighthouse positioning system. Two approaches are considered for navigation, a discrete navigation problem using Deep Q-Learning with finite predefined motion primitives, and deep reinforcement learning for a continuous navigation approach. Simulations for RL training will be performed on gym-pybullet-drones, an open-source gym-based environment for reinforcement learning, and the RL implementations are provided by stable-baselines3.
Robotics, Machine Learning
What problem does this paper attempt to address?
The main objective of this paper is to explore the synergy between classical control algorithms (such as PID) and modern reinforcement learning algorithms to achieve effective control of the CrazyFlie 2.X quadrotor. Specifically, the paper addresses the following issues:

1. **PID Parameter Tuning**: Use reinforcement learning to tune the parameters of the PID controllers in the CrazyFlie 2.X quadrotor. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is used to approximate the parameters of the attitude and position controllers, which are then compared with the original values.
2. **Navigation Control**: Building on the PID parameter tuning, implement navigation control. In combination with a high-precision positioning system (such as Lighthouse), reinforcement learning strategies are used to navigate from the current position to a specified target point. The paper explores two methods:
   - Deep Q-Learning with a finite set of predefined motion primitives, for the discrete navigation problem.
   - Deep reinforcement learning, for the continuous navigation problem.
3. **Robustness Evaluation**: Investigate whether introducing external disturbances during training improves the model's robustness. The experimental results show that the model is already robust to some degree when trained without disturbances, and that adding disturbances during training does not significantly improve performance.

In summary, the paper aims to develop a practical and efficient quadrotor control system by combining traditional PID control techniques with reinforcement learning methods.
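
The PID tuning objective described above can be framed as a one-step decision: the agent outputs a normalized action, the action is scaled to PID gains, and an episode with those gains yields a scalar return. The following is a minimal sketch of that framing only. The 1-D unit-mass altitude model, the gain bounds, and the reward (negative accumulated absolute error) are all assumptions made for illustration; the paper trains TD3 from stable-baselines3 in gym-pybullet-drones, and its actual observation, action, and reward definitions are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook PID controller with a fixed time step."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float) -> float:
        self.integral += error * self.dt
        # No derivative term on the first call, to avoid a derivative kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def scale_action(action, low, high):
    """Map each action component from [-1, 1] to [low_i, high_i]."""
    return [lo + 0.5 * (a + 1.0) * (hi - lo) for a, lo, hi in zip(action, low, high)]


def episode_return(action, target=1.0, steps=500, dt=0.01):
    """Run one altitude-hold episode on a toy unit-mass model.

    Returns the negative time-integrated absolute tracking error, so a
    higher value means better tracking. Gain bounds are hypothetical.
    """
    kp, ki, kd = scale_action(action, low=(0.0, 0.0, 0.0), high=(20.0, 5.0, 10.0))
    pid = PID(kp, ki, kd, dt)
    z, vz = 0.0, 0.0
    total = 0.0
    for _ in range(steps):
        error = target - z
        accel = pid.step(error)  # assumption: gravity is already compensated
        vz += accel * dt         # semi-implicit Euler integration
        z += vz * dt
        total -= abs(error) * dt
    return total


if __name__ == "__main__":
    print("zero gains :", episode_return((-1.0, -1.0, -1.0)))
    print("PD gains   :", episode_return((0.0, -1.0, 0.0)))
```

An RL algorithm such as TD3 would search over `action` to maximize `episode_return`; here the two calls at the bottom simply compare a controller with all gains at zero against a hand-picked PD setting.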
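
For the discrete navigation approach, a finite set of motion primitives means that each action index maps to a fixed displacement, and a Deep Q-Learning policy picks the index with the highest Q-value at each step. The sketch below illustrates only the primitive mechanics. The primitive set, the 0.1 m step size, and the tolerance are hypothetical, and the trained Q-network is replaced by a greedy distance heuristic so the example runs without any training; it is not the paper's policy.

```python
import math

STEP = 0.1  # metres per primitive (hypothetical value)

# Hypothetical primitive set: action index -> displacement (dx, dy, dz).
PRIMITIVES = {
    0: (0.0, 0.0, 0.0),    # hover
    1: (STEP, 0.0, 0.0),   # +x
    2: (-STEP, 0.0, 0.0),  # -x
    3: (0.0, STEP, 0.0),   # +y
    4: (0.0, -STEP, 0.0),  # -y
    5: (0.0, 0.0, STEP),   # +z
    6: (0.0, 0.0, -STEP),  # -z
}


def apply_primitive(position, action):
    """Return the position reached after executing one primitive."""
    dx, dy, dz = PRIMITIVES[action]
    x, y, z = position
    return (x + dx, y + dy, z + dz)


def greedy_action(position, goal):
    """Stand-in for argmax_a Q(s, a): pick the primitive that ends closest to the goal."""
    return min(PRIMITIVES, key=lambda a: math.dist(apply_primitive(position, a), goal))


def navigate(start, goal, tol=0.05, max_steps=200):
    """Chain primitives until the goal is within `tol` metres or steps run out."""
    position = start
    path = [start]
    for _ in range(max_steps):
        if math.dist(position, goal) <= tol:
            break
        position = apply_primitive(position, greedy_action(position, goal))
        path.append(position)
    return path


if __name__ == "__main__":
    for waypoint in navigate((0.0, 0.0, 0.0), (0.3, 0.2, 0.1)):
        print(waypoint)
```

In a learned setting, `greedy_action` would be replaced by a query to the trained Q-network, while `PRIMITIVES` and `apply_primitive` would stay the same; keeping the action set small is what makes the discrete formulation tractable for Deep Q-Learning.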