Prediction and analysis of train arrival delay based on XGBoost and Bayesian optimization

Rui Shi, Xinyue Xu, Jianmin Li, Yanqiu Li
DOI: https://doi.org/10.1016/j.asoc.2021.107538
IF: 8.7
2021-09-01
Applied Soft Computing
Abstract: Accurate train arrival delay prediction is critical for real-time train dispatching and for improving transportation service. This study proposes a data-driven method that combines eXtreme Gradient Boosting (XGBoost) with a Bayesian optimization (BO) algorithm to predict train arrival delays. First, eleven characteristics that may affect the train arrival time at the next scheduled station are identified as independent variables. Second, an XGBoost prediction model is established to capture the relation between train arrival delays and various railway system characteristics. Third, the BO algorithm is applied to optimize the hyperparameters of the XGBoost model and thereby improve prediction accuracy. Case studies using data from two high-speed railway (HSR) lines in China are then performed to analyze the prediction efficiency and accuracy of the proposed model for different delay bins and at different stations. The results on the two HSR lines show that the proposed method outperforms the benchmark models on the coefficient of determination (0.9889/0.9905), root-mean-squared error (2.686/1.887), and mean absolute error (0.896/0.802). In addition, statistical tests, namely the Friedman Test (FT) and the Wilcoxon Signed Rank Test (WSRT), are carried out to validate the efficacy of the proposed method. Furthermore, train arrival delays under different abnormal events can also be accurately forecast using the proposed method; the results indicate that it outperforms the benchmark methods, especially in predicting long delays caused by specific abnormal events.
computer science, artificial intelligence, interdisciplinary applications
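The three performance metrics named in the abstract (coefficient of determination, root-mean-squared error, and mean absolute error) can be illustrated with a minimal sketch using their standard definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper; the abstract does not state the unit of the reported errors, so treating the delays as minutes here is an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted delays."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares and total sum of squares around the mean.
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Illustrative values only (assumed to be delays in minutes), not data from the study.
observed = [0.0, 2.0, 5.0, 12.0, 30.0]
predicted = [1.0, 2.0, 4.0, 14.0, 28.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R^2={r2:.4f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

Because RMSE squares each residual before averaging, it penalizes large errors more heavily than MAE, which is one reason both are commonly reported together when long delays matter.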