Ruiqi Zhang,Dingqi Zhang,Mark W. Mueller
Abstract: This paper proposes ProxFly, a residual deep Reinforcement Learning (RL)-based controller for close-proximity quadcopter flight. Specifically, we design a residual module on top of a cascaded controller (denoted as the basic controller) to generate high-level control commands, which compensate for external disturbances and thrust loss caused by downwash effects from other quadcopters. First, our method takes only the ego state and the controllers' commands as inputs and does not rely on any communication between quadcopters, thereby reducing the bandwidth requirement. Second, through domain randomization, our method relaxes the requirement for accurate system identification and fine-tuned controller parameters, allowing it to adapt to changing system models. Third, our method not only reduces the proportion of unexplainable black-box signals in the control commands but also lets RL training skip time-consuming exploration from scratch by using guidance from the basic controller. We validate the effectiveness of the residual module in simulation at different proximities. Moreover, we conduct real close-proximity flight tests to compare ProxFly with the basic controller and with an advanced model-based controller that uses complex aerodynamic compensation. Finally, we show that ProxFly can be used for challenging quadcopter in-air docking, where two quadcopters fly in extreme proximity and strong airflow significantly disrupts flight; even in this case, our method stabilizes the quadcopter and accomplishes docking. The resources are available at https://github.com/ruiqizhang99/ProxFly.
### Problems the paper attempts to solve
The paper "ProxFly: Robust Control for Close Proximity Quadcopter Flight via Residual Reinforcement Learning" addresses the control of multirotor unmanned aerial vehicles (UAVs) flying in close proximity. Specifically, it proposes ProxFly, a controller based on residual deep reinforcement learning (residual RL), to address the following challenges:
1. **Complex aerodynamic effects**: When multirotor UAVs fly in close proximity, complex aerodynamic interactions occur, such as the downwash of the upper UAV acting on the lower UAV. This effect causes thrust loss on the lower UAV as well as complex external forces and moments, which are difficult to model accurately with traditional model-based control methods.
2. **Uncertainty of system parameters**: Traditional model-based control methods rely on accurate system identification and finely tuned controller parameters, which are often hard to obtain in practice. These methods are also computationally intensive and difficult to transfer to different UAV models.
3. **Changing dynamics of multirotor UAVs**: In close-proximity flight tasks, the dynamic characteristics of the system change significantly. Traditional reinforcement learning methods usually apply only to specific model configurations and tasks, so they are poorly suited to scenarios with large changes in dynamics.
4. **Interpretability of black-box control**: The control signals generated by traditional black-box control methods are difficult to interpret, which is a significant obstacle in practical applications.
### Solutions
To address the above challenges, the paper proposes the ProxFly controller, which has the following main features:
- **Residual module**: ProxFly adds a residual module on top of a traditional cascaded controller (the basic controller). This module uses deep reinforcement learning to generate high-level control commands that compensate for external disturbances and for thrust loss caused by downwash effects.
- **No need for communication**: The method depends only on the UAV's own state and the controllers' commands and does not need to communicate with other UAVs, which reduces the bandwidth requirement.
- **Domain randomization**: Through domain randomization, ProxFly relaxes the requirements for accurate system identification and controller parameter tuning, enabling it to adapt to different system models.
- **Accelerated learning**: The residual module is guided by the output of the basic controller, so training skips the time-consuming exploration from scratch and learning becomes more efficient.
- **Enhanced robustness**: ProxFly not only improves the position and attitude control accuracy of the basic controller but also maintains stability in extremely close-proximity flight tasks, such as in-air docking.
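The residual structure and the domain randomization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scalar thrust command, the PD altitude law standing in for the basic controller, the parameter ranges, the residual bound, and all names (`QuadParams`, `sample_params`, `basic_thrust_command`, `zero_policy`, `combined_command`) are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2


@dataclass
class QuadParams:
    """Hypothetical model parameters perturbed by domain randomization."""
    mass_kg: float
    thrust_scale: float  # 1.0 = nominal thrust, below 1.0 = thrust loss


def sample_params(rng, nominal_mass_kg=0.25, mass_spread=0.2, min_thrust_scale=0.7):
    """Draw one randomized model, e.g. at the start of each training episode.

    The ranges are assumed values, not taken from the paper.
    """
    mass_kg = nominal_mass_kg * rng.uniform(1.0 - mass_spread, 1.0 + mass_spread)
    thrust_scale = rng.uniform(min_thrust_scale, 1.0)
    return QuadParams(mass_kg=mass_kg, thrust_scale=thrust_scale)


def basic_thrust_command(z_err, vz_err, nominal_mass_kg=0.25, kp=6.0, kd=4.0):
    """Stand-in for the basic controller: PD altitude law with gravity feedforward (N)."""
    return nominal_mass_kg * (GRAVITY + kp * z_err + kd * vz_err)


def zero_policy(observation):
    """Placeholder for the trained residual network; always outputs no correction."""
    return 0.0


def combined_command(ego_state, base_cmd, policy=zero_policy, max_residual=1.0):
    """Add a bounded residual to the basic controller's command.

    The policy observes only the ego state and the basic controller's command,
    so no information from other quadcopters is needed.
    """
    observation = list(ego_state) + [base_cmd]
    residual = policy(observation)
    # Bounding the residual keeps the basic controller's share of the command dominant.
    residual = max(-max_residual, min(max_residual, residual))
    return base_cmd + residual


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_params(rng)
    base = basic_thrust_command(z_err=0.1, vz_err=0.0)
    print(params, combined_command((0.1, 0.0), base))
```

Because the policy output is added to the basic controller's command instead of replacing it, an untrained policy that outputs zero already reproduces the basic controller's behavior, which is one way to read the claim that training need not explore from scratch.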
### Experimental verification
The paper verifies the effectiveness of ProxFly through simulation and actual flight tests. The experiments include:
- **Close-proximity hovering**: A small UAV (SQ) hovers 50 centimeters above a large UAV (LQ) for 10 seconds to verify the robustness of position and attitude control.
- **Circular flight in the same direction**: With SQ 50 centimeters above LQ, the two UAVs track a counterclockwise circular trajectory with a diameter of 1.5 meters and a period of 7.5 seconds, to evaluate the controller's robustness under continuous airflow interference and its trajectory tracking accuracy.
- **Circular flight in the opposite direction**: The circular flight is repeated in the opposite direction to verify the controller's performance in different directions.
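To make the circular test concrete, the following sketch generates a reference position and velocity on a circle with the stated diameter of 1.5 meters and period of 7.5 seconds, which corresponds to a tangential speed of about 0.63 m/s. The function name `circle_reference`, the circle center at the origin, and the flight heights are assumptions; the summary does not give them.

```python
import math


def circle_reference(t, diameter_m=1.5, period_s=7.5, height_m=1.0, clockwise=False):
    """Return (position, velocity) on a horizontal circle centered at the origin.

    Counterclockwise is defined as seen from above, with z pointing up.
    """
    radius = diameter_m / 2.0
    omega = 2.0 * math.pi / period_s  # angular rate in rad/s
    if clockwise:
        omega = -omega
    angle = omega * t
    position = (radius * math.cos(angle), radius * math.sin(angle), height_m)
    velocity = (-radius * omega * math.sin(angle), radius * omega * math.cos(angle), 0.0)
    return position, velocity


if __name__ == "__main__":
    # Assumed heights: LQ at 1.0 m and SQ 0.5 m above it, as in the stacked tests.
    for t in (0.0, 1.875, 3.75):
        lq_pos, _ = circle_reference(t, height_m=1.0)
        sq_pos, _ = circle_reference(t, height_m=1.5)
        print(t, lq_pos, sq_pos)
```

Passing `clockwise=True` reverses the direction of travel along the same circle.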
The experimental results show that ProxFly achieves significantly lower position and attitude control errors than the basic controller and performs well in difficult tasks such as in-air docking.
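One common way to quantify such control errors is the root-mean-square error (RMSE) over a logged flight. The sketch below assumes this metric for illustration; the summary does not state which error measure the paper reports, and the function names are hypothetical.

```python
import math


def position_error_norms(actual, reference):
    """Euclidean distance between logged and reference positions at each time step."""
    return [math.dist(a, r) for a, r in zip(actual, reference)]


def rmse(errors):
    """Root-mean-square of a non-empty sequence of scalar errors."""
    if not errors:
        raise ValueError("errors must not be empty")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    # Made-up samples, only to show how the functions are called.
    logged = [(0.02, 0.00, 1.01), (0.00, -0.03, 0.98)]
    target = [(0.00, 0.00, 1.00), (0.00, 0.00, 1.00)]
    print(rmse(position_error_norms(logged, target)))
```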