Abstract: The aviation industry has seen steady growth in air traffic since the deregulation of the U.S. airline industry in 1978. As a result, flight delays have become a major concern for airlines and passengers, prompting substantial research on the factors that affect departure, arrival, and total delays. Flight delays increase the consumption of limited resources such as fuel, labor, and capital, and they are expected to grow in the coming decades. To address the flight delay problem, this research proposes a hybrid approach that combines features of deep learning and classic machine learning techniques. In addition, several machine learning algorithms are applied to the flight data to validate the results of the proposed model. To measure model performance, accuracy, precision, recall, and F1-score are calculated, and ROC curves with their AUC values are generated. The study also includes an extensive analysis of the flight data and of each model to obtain insights for U.S. airlines.
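The evaluation metrics named in the abstract have standard definitions, and a minimal sketch of how they could be computed for a binary delayed/on-time classifier may help. Everything below is illustrative: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions, not data or code from the paper.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as 'delayed' and 0 as 'on time'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (delayed, on-time) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and predicted delay probabilities (not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC here is quadratic in the number of samples, which is fine for illustration; on a dataset the size of the one described, a library routine would be the practical choice.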

What problem does this paper attempt to address?

The paper addresses the problem of predicting flight delays for U.S. airlines. Specifically, the researchers focus on departure delays, arrival delays, and total delays, and propose a hybrid approach that combines deep learning and classical machine learning techniques. Using a large volume of flight data, the paper validates the proposed model and compares its performance against other models.

### Research Background

Since the deregulation of the U.S. aviation industry in 1978, air traffic has continued to grow, making flight delays a major issue for airlines and passengers. Delays increase the consumption of limited resources such as fuel, labor, and capital, and they are expected to worsen in the coming decades. Studying the causes of flight delays and methods for predicting them is therefore important.

### Research Objectives

1. **Classify Flight Delay Problems**: Divide the flight delay problem into three sub-problems: departure delay, arrival delay, and total delay.
2. **Develop a Hybrid Approach**: Propose a new hybrid approach that combines deep learning and classical machine learning techniques to predict flight delays.
3. **Validate Model Performance**: Use various machine learning algorithms to validate the proposed method, and evaluate model performance with accuracy, precision, recall, and F1 score.
4. **Data Analysis**: Conduct a detailed analysis of flight data from U.S. airlines to extract key features and insights.

### Main Contributions

1. **Literature Review**: Reviewed existing literature and summarized the current state of flight delay research.
2. **Data Collection**: Collected 27 months of flight data from U.S. airlines, covering multiple factors.
3. **Data Analysis**: Conducted a detailed analysis of the flight data and generated charts to present key insights.
4. **Hybrid Approach**: Developed a new method that combines deep learning and traditional machine learning techniques to predict flight delays for U.S. airlines.

### Methods

1. **Fully Connected Neural Network (FCNN)**: Used to extract high-dimensional feature representations from the data.
2. **Random Forest**: Used for classification to improve the model's generalization ability.
3. **XGBoost**: A gradient-boosted tree method that optimizes a training loss together with regularization terms.
4. **Hybrid Approach**: Uses the output of the FCNN as feature input to Random Forest and XGBoost, combining the advantages of both.

### Experimental Results

1. **Departure Delay**: XGBoost performed best on accuracy, F1 score, and precision, while FCNN + Random Forest performed better on recall.
2. **Arrival Delay**: XGBoost led on most metrics, while FCNN + Random Forest performed slightly better on recall.
3. **Total Delay**: FCNN + Random Forest performed best on all metrics, with an AUC of 0.97.

### Conclusion

The proposed method achieved strong results in predicting flight delays, especially on the total delay task. By combining deep learning and classical machine learning techniques, the model can better capture complex patterns in the data and improve prediction accuracy. These results have practical implications for airlines seeking to optimize operations, reduce delays, and improve customer satisfaction.
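The hybrid approach described under Methods, in which an FCNN feeds a tree-based classifier, could be sketched roughly as follows. This is not the authors' implementation. It assumes that "output of the FCNN" means the activations of the last hidden layer, uses scikit-learn's `MLPClassifier` as a stand-in FCNN, and trains on synthetic data from `make_classification` in place of real flight records. The helper `hidden_features`, the layer sizes, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def hidden_features(mlp: MLPClassifier, X: np.ndarray) -> np.ndarray:
    """Forward X through the hidden layers only (ReLU), skipping the output layer."""
    h = X
    for W, b in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
        h = np.maximum(h @ W + b, 0.0)
    return h


# Synthetic stand-in for flight records (assumption: binary delayed / on-time label).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Standardize inputs; neural networks are sensitive to feature scale.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Stage 1: train the fully connected network on the raw (scaled) features.
fcnn = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                     max_iter=300, random_state=0)
fcnn.fit(X_train_s, y_train)

# Stage 2: train a random forest on the network's learned representation.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(hidden_features(fcnn, X_train_s), y_train)

# Evaluate the two-stage model on held-out data.
test_features = hidden_features(fcnn, X_test_s)
hybrid_accuracy = forest.score(test_features, y_test)
print(f"hybrid (FCNN features + random forest) accuracy: {hybrid_accuracy:.3f}")
```

For the FCNN + XGBoost variant, a gradient-boosted classifier such as `xgboost.XGBClassifier` could presumably take the place of the random forest in stage 2, since it exposes the same `fit`/`predict` interface. Whether the paper trains the two stages separately, as here, or jointly is not stated in the summary.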

Flight Delay Prediction using Hybrid Machine Learning Approach: A Case Study of Major Airlines in the United States
