Safety Enhancement for Deep Reinforcement Learning in Autonomous Separation Assurance

Wei Guo, Marc Brittain, Peng Wei
DOI: https://doi.org/10.48550/arXiv.2105.02331
2022-02-20
Abstract: The separation assurance task will be extremely challenging for air traffic controllers in a complex and high density airspace environment. Deep reinforcement learning (DRL) was used to develop an autonomous separation assurance framework in our previous work where the learned model advised speed maneuvers. In order to improve the safety of this model in unseen environments with uncertainties, in this work we propose a safety module for DRL in autonomous separation assurance applications. The proposed module directly addresses both model uncertainty and state uncertainty to improve safety. Our safety module consists of two sub-modules: (1) the state safety sub-module is based on the execution-time data augmentation method to introduce state disturbances in the model input state; (2) the model safety sub-module is a Monte-Carlo dropout extension that learns the posterior distribution of the DRL model policy. We demonstrate the effectiveness of the two sub-modules in an open-source air traffic simulator with challenging environment settings. Through extensive numerical experiments, our results show that the proposed sub-safety modules help the DRL agent significantly improve its safety performance in an autonomous separation assurance task.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to improve safety in autonomous separation assurance in complex, high-density airspace. In previous work, the authors developed an autonomous separation assurance framework based on deep reinforcement learning (DRL) that advises speed maneuvers to avoid conflicts between aircraft. In unseen environments and under uncertainty, however, the safety performance of such a model may degrade. The paper therefore proposes a safety module for DRL that enhances safety by directly addressing model uncertainty and state uncertainty.

The main contributions of the paper can be summarized as follows:

1. A safety-enhancing module named DODA (Dropout and Data Augmentation) is proposed, which can improve the safety performance of general DRL agents without additional training or transfer learning.
2. The module contains two sub-modules, which address state uncertainty and model uncertainty respectively, and the effectiveness of each sub-module is demonstrated.
3. Extensive numerical experiments in an open-source air traffic simulator show that the integrated DODA safety module is effective in complex and challenging ATC environments and outperforms DRL agents without the DODA module.

The paper handles the two kinds of uncertainty as follows:

- **Model uncertainty**: Handled with Monte Carlo dropout (MC-dropout), which approximates the posterior distribution by dropping network units during execution and thereby estimates the uncertainty in the model parameters.
- **State uncertainty**: Handled with execution-time data augmentation (DA), which adds random noise to the input state to simulate the observation, sensor, or communication noise that may arise in practice.

Together, these two sub-modules enable DRL agents to make safer and more robust decisions when facing unseen scenarios and uncertainties.
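
To make the two ideas concrete, here is a minimal NumPy sketch of how MC-dropout and execution-time data augmentation might be combined at decision time. It is an illustration under stated assumptions, not the paper's implementation. The tiny randomly initialized network stands in for a trained policy, and the names `make_policy`, `policy_probs`, and `safe_action`, the three-action speed space, the Gaussian noise model, the sample counts, and the majority vote used to pick the final action are all hypothetical choices made for this example.

```python
import numpy as np


def make_policy(state_dim, hidden_dim, n_actions, rng):
    """Random weights stand in for a trained DRL policy (assumption)."""
    return {
        "W1": rng.normal(0.0, 0.5, size=(state_dim, hidden_dim)),
        "W2": rng.normal(0.0, 0.5, size=(hidden_dim, n_actions)),
    }


def policy_probs(params, state, rng, drop_rate):
    """One stochastic forward pass with dropout kept active at execution."""
    hidden = np.maximum(0.0, state @ params["W1"])  # ReLU hidden layer
    if drop_rate > 0.0:
        keep = rng.random(hidden.shape) >= drop_rate
        hidden = hidden * keep / (1.0 - drop_rate)  # inverted dropout scaling
    logits = hidden @ params["W2"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def safe_action(params, state, rng, n_augment=8, n_dropout=8,
                noise_std=0.05, drop_rate=0.2):
    """Majority vote over noisy states (DA) and dropout samples (MC-dropout).

    The vote is an assumed aggregation rule chosen for illustration.
    """
    votes = np.zeros(params["W2"].shape[1], dtype=int)
    for _ in range(n_augment):
        # State uncertainty: perturb the observed state with Gaussian noise.
        noisy_state = state + rng.normal(0.0, noise_std, size=state.shape)
        for _ in range(n_dropout):
            # Model uncertainty: each pass samples a different dropout mask.
            probs = policy_probs(params, noisy_state, rng, drop_rate)
            votes[int(np.argmax(probs))] += 1
    return int(np.argmax(votes)), votes


rng = np.random.default_rng(0)
# Three actions as a stand-in for speed maneuvers (assumed action space).
params = make_policy(state_dim=6, hidden_dim=32, n_actions=3, rng=rng)
state = rng.normal(size=6)
action, votes = safe_action(params, state, rng)
print("chosen action:", action, "votes:", votes)
```

In this sketch the spread of `votes` gives a rough picture of how sensitive the decision is to input noise and to the sampled dropout masks. When the vote is split, a more conservative selection rule could replace the simple majority used here.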