AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights

Junyang Zhang, Cristian Emanuel Ocampo Rivera, Kyle Tyni, Steven Nguyen, Ulices Santa Cruz Leal, Yasser Shoukry
2024-09-01
Abstract: Navigation precision, speed, and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission execution in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control law fails to capture the nonlinear nature of dynamic wind conditions and complex drone systems. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. AirPilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving the effective navigation speed of a fine-tuned PID controller by 21%, and reducing settling time and overshoot by 17% and 16%, respectively.
Robotics, Machine Learning, Systems and Control
What problem does this paper attempt to address?
This paper addresses the difficulty of achieving precise, fast, and stable UAV navigation in dynamic, unpredictable environments. Conventional Proportional-Integral-Derivative (PID) controllers apply a linear control law, so they cope poorly with nonlinear dynamics such as turbulent wind and with demanding mission requirements. Manually retuning the PID gains for each mission is also time-consuming and requires expertise. The paper therefore proposes AirPilot, an adaptive PID controller based on Deep Reinforcement Learning (DRL) that combines the simplicity and effectiveness of PID control with the learning and optimization capabilities of DRL. Specifically, AirPilot uses the Proximal Policy Optimization (PPO) algorithm to adjust the PID gains in real time as flight tasks and environmental conditions change. The research team also integrated a 3D A* path planning algorithm so that the UAV follows the shortest collision-free trajectory, and used a high-precision indoor Vicon tracking system to improve positioning accuracy. According to the reported results, AirPilot reduces the navigation error of the default PX4 PID position controller by 90%, improves the effective navigation speed of a fine-tuned PID controller by 21%, and reduces settling time and overshoot by 17% and 16%, respectively. The authors argue that these improvements make AirPilot well suited to modern UAV applications.
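To make the idea of a policy adjusting PID gains in real time more concrete, here is a minimal Python sketch on a one-dimensional point mass. It is not the authors' implementation: the `PID` class, the `stand_in_policy` function, the gain ranges, and the toy plant are all assumptions chosen for illustration. In particular, `stand_in_policy` is a hand-written placeholder for what would, in a PPO-based setup, be a trained actor network mapping observations to gains.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PID:
    """Textbook PID whose gains can be overwritten between steps."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # Skip the derivative on the first call to avoid a derivative kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def stand_in_policy(error: float, velocity: float) -> Tuple[float, float, float]:
    """Hypothetical placeholder for a trained PPO actor.

    Maps the observation (error, velocity) to gains (kp, ki, kd).
    The numbers are illustrative, not taken from the paper.
    """
    kp = 2.0 + 2.0 * min(abs(error), 1.0)     # stiffer when far from the target
    kd = 1.5 + 1.0 * min(abs(velocity), 1.0)  # more damping when moving fast
    ki = 0.5                                  # constant integral gain
    return kp, ki, kd


def simulate(target: float = 1.0, steps: int = 3000,
             dt: float = 0.01, wind: float = 0.0) -> float:
    """Drive a unit point mass to `target` under a constant wind acceleration."""
    pos, vel = 0.0, 0.0
    pid = PID(kp=0.0, ki=0.0, kd=0.0)
    for _ in range(steps):
        error = target - pos
        # The policy re-schedules the gains at every control step.
        pid.kp, pid.ki, pid.kd = stand_in_policy(error, vel)
        accel = pid.step(error, dt) + wind
        vel += accel * dt
        pos += vel * dt
    return pos


if __name__ == "__main__":
    print("final position, no wind:", simulate())
    print("final position, constant wind:", simulate(wind=0.5))
```

Because the gains depend on the current state, the resulting control law is nonlinear even though each individual step is an ordinary PID update, which is the sense in which a learned gain scheduler can go beyond a fixed-gain linear PID controller.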