Deep Reinforcement Learning for Real-Time Ground Delay Program Revision and Corresponding Flight Delay Assignments

Ke Liu, Fan Hu, Hui Lin, Xi Cheng, Jianan Chen, Jilin Song, Siyuan Feng, Gaofeng Su, Chen Zhu
2024-08-14
Abstract: This paper explores the optimization of Ground Delay Programs (GDP), a prevalent Traffic Management Initiative used in Air Traffic Management (ATM) to reconcile capacity and demand discrepancies at airports. Employing Reinforcement Learning (RL) to manage the inherent uncertainties in the national airspace system (such as weather variability, fluctuating flight demands, and airport arrival rates), we developed two RL models: Behavioral Cloning (BC) and Conservative Q-Learning (CQL). These models are designed to enhance GDP efficiency through a reward function that integrates ground delays, airborne delays, and terminal area congestion. We constructed a simulated single-airport environment, SAGDP_ENV, which incorporates real operational data along with predicted uncertainties to facilitate realistic decision-making scenarios. Using data for all of 2019 from Newark Liberty International Airport (EWR), our models aimed to preemptively set airport program rates. Despite thorough modeling and simulation, initial outcomes indicated that the models struggled to learn effectively, potentially because of oversimplified environmental assumptions. This paper discusses the challenges encountered, evaluates the models' performance against actual operational data, and outlines future directions to refine RL applications in ATM.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the optimization of the Ground Delay Program (GDP) in air traffic management. Specifically, it aims to improve GDP efficiency, reduce flight delays, and improve airport operations by developing and applying deep reinforcement learning (RL) models. The main objectives are:

1. **Reducing flight delays**: Dynamically adjusting the airport's Program Arrival Rate (PAAR) so that airborne delay is transferred to the ground, reducing airborne waiting time.
2. **Improving system performance**: Optimizing airport utilization, airline fairness, and air traffic controllers' workload.
3. **Handling uncertainty**: Addressing uncertainties in the National Airspace System, such as weather changes, fluctuations in flight demand, and variations in airport arrival rates.

To achieve these goals, the paper constructs a simulated single-airport environment (SAGDP_ENV) and develops two RL models: Behavioral Cloning (BC) and Conservative Q-Learning (CQL). Both models use a reward function that jointly accounts for ground delays, airborne delays, and terminal area congestion. However, preliminary results indicate that the models struggled to learn and to improve, which the authors attribute potentially to overly simplified environmental assumptions. The paper discusses these challenges in detail and proposes future research directions for applying RL in air traffic management.
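To make the trade-off between ground delay, airborne delay, and terminal area congestion concrete, the following is a minimal sketch of a one-hour queueing step and a weighted reward. It is not the paper's SAGDP_ENV or its reward function: the names `step`, `reward`, `RewardWeights`, and `queue_threshold`, the weight values, and the simplification that released flights reach the airport within the same hour are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical unit costs; these values are illustrative, not from the paper."""
    ground: float = 1.0      # cost per flight held on the ground for the hour
    airborne: float = 3.0    # cost per flight held in the air for the hour
    congestion: float = 2.0  # extra cost per airborne flight above the threshold


def step(ground_queue, air_queue, scheduled, program_rate, capacity):
    """Advance a toy single-airport model by one hour.

    Assumes zero flight time: a flight released this hour can land this hour.
    Returns the updated (ground_queue, air_queue).
    """
    # Flights that want to depart toward the airport this hour.
    waiting = ground_queue + scheduled
    released = min(waiting, program_rate)  # the program rate caps releases
    ground_queue = waiting - released      # the remainder waits on the ground

    # Released flights join any flights already holding near the airport.
    arriving = air_queue + released
    landed = min(arriving, capacity)       # the airport lands up to its capacity
    air_queue = arriving - landed
    return ground_queue, air_queue


def reward(ground_queue, air_queue, queue_threshold=2, w=RewardWeights()):
    """Negative weighted cost of ground delay, airborne delay, and congestion."""
    excess = max(0, air_queue - queue_threshold)
    cost = w.ground * ground_queue + w.airborne * air_queue + w.congestion * excess
    return -cost


if __name__ == "__main__":
    # 40 scheduled arrivals against a capacity of 30 flights per hour.
    for rate in (40, 30):
        g, a = step(0, 0, scheduled=40, program_rate=rate, capacity=30)
        print(f"program_rate={rate}: ground={g}, airborne={a}, reward={reward(g, a)}")
```

Under these assumed weights, setting the program rate equal to capacity keeps the ten excess flights on the ground and scores better than releasing all forty, which is the behavior a GDP is meant to produce. An RL agent would choose `program_rate` each hour without knowing `capacity` in advance.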