A multi-head attention-based transformer model for traffic flow forecasting with a comparative analysis to recurrent neural networks

Selim Reza, Marta Campos Ferreira, J.J.M. Machado, João Manuel R.S. Tavares
DOI: https://doi.org/10.1016/j.eswa.2022.117275
IF: 8.5
2022-09-01
Expert Systems with Applications
Abstract: Traffic flow forecasting is an essential component of an intelligent transportation system for mitigating congestion. Recurrent neural networks, particularly gated recurrent units and long short-term memory networks, have been the state-of-the-art traffic flow forecasting models for the last few years. However, a more sophisticated and resilient model is necessary to effectively capture long-range correlations in the time-series data under analysis. Transformers, which have come to dominate natural language processing by overcoming the drawbacks of recurrent neural networks, might meet this need and lead to successful time-series forecasting. This article presents a multi-head attention-based transformer model for traffic flow forecasting and compares it with gated recurrent unit and long short-term memory based models on the PeMS dataset. The model uses 5 attention heads with 5 identical encoder and decoder layers and relies on square subsequent masking. The results demonstrate the promising performance of the transformer-based model in predicting long-term traffic flow patterns once it has been fed a substantial amount of data. The model also improves on the current baselines' mean squared error and mean absolute percentage error by 1.25% to 47.8% and 32.4% to 83.8%, respectively.
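The masking technique and the two error metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names below are hypothetical, the mask follows the common additive convention (0 for allowed positions, negative infinity for future positions), and the MAPE helper assumes every actual value is nonzero.

```python
import math


def square_subsequent_mask(size):
    """Build a size x size additive attention mask (hypothetical helper).

    Row i holds 0.0 for positions j <= i (visible) and -inf for j > i
    (future time steps), so a decoder cannot look ahead in the sequence.
    """
    return [[0.0 if j <= i else -math.inf for j in range(size)]
            for i in range(size)]


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute relative error, in percent.

    Assumption: no actual value is zero, otherwise the ratio is undefined.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    for row in square_subsequent_mask(3):
        print(row)
    flows = [100.0, 200.0]
    forecasts = [110.0, 190.0]
    print(mean_squared_error(flows, forecasts))
    print(mean_absolute_percentage_error(flows, forecasts))
```

Adding the mask to the attention scores before the softmax drives the weights of future positions to zero, which is what lets the decoder be trained on whole sequences without seeing the values it is asked to predict.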
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science