Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán

Andrew J. Charlton-Perez, Helen F. Dacre, Simon Driscoll, Suzanne L. Gray, Ben Harvey, Natalie J. Harvey, Kieran M. R. Hunt, Robert W. Lee, Ranjini Swaminathan, Remy Vandaele, Ambrogio Volonté
DOI: https://doi.org/10.1038/s41612-024-00638-w
IF: 9.4475
2024-04-23
npj Climate and Atmospheric Science
Abstract: There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare short to medium-range forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numerical weather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of the warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciarán.
meteorology & atmospheric sciences
What problem does this paper attempt to address?
The paper examines how machine learning (ML) models compare with traditional physics-based numerical weather prediction (NWP) models when forecasting high-impact weather events. Specifically, it compares forecasts of Storm Ciarán, which struck Europe in November 2023, from four ML models (FourCastNet, Pangu-Weather, GraphCast, and FourCastNet-v2) against forecasts from NWP models. The study found:

1. **Large-scale structure prediction**: All four ML models accurately captured the large-scale (synoptic-scale) structure of the storm, such as the position of the cyclone center, the location of the cloud head, the shape of the warm sector, and important dynamical drivers (e.g., the position of the storm relative to the upper-level jet exit).
2. **Detailed structure prediction**: For the finer-scale structures that matter when issuing weather warnings, the ML models' performance was mixed. All of them underestimated the peak wind speeds associated with the storm. Only some resolved the warm core seclusion, and none captured the sharp bent-back warm frontal gradient.
3. **Wind speed intensity**: Although the ML models simulated the storm's rapid development phase and its maximum intensity relatively well, they generally underestimated the maximum surface wind speeds. This matters for economic loss assessments, because even a slight underestimate of wind speed can produce a large difference in the estimated loss.

In summary, the paper uses a single, well-documented case study to show where current ML weather forecasting models match traditional NWP models and where they fall short, particularly for high-impact weather events such as Storm Ciarán.
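The sensitivity of loss estimates to wind speed noted above can be illustrated with a small sketch. It assumes a cubic excess-over-threshold loss index, a common simplification in the windstorm literature and not necessarily the formulation used in the paper. The function name `loss_index`, the threshold, and the gust values are all hypothetical and chosen only for illustration.

```python
def loss_index(wind_speed, threshold):
    """Illustrative, dimensionless loss index.

    Assumption: loss grows as the cube of the relative excess of wind
    speed over a local damage threshold, and is zero below it.
    """
    excess = max(wind_speed / threshold - 1.0, 0.0)
    return excess ** 3


# Hypothetical numbers for illustration only; none are taken from the paper.
threshold = 25.0                      # assumed local damage threshold, m/s
observed_gust = 40.0                  # assumed observed peak gust, m/s
forecast_gust = 0.9 * observed_gust   # a forecast that is 10% too low

observed_loss = loss_index(observed_gust, threshold)
forecast_loss = loss_index(forecast_gust, threshold)
ratio = forecast_loss / observed_loss

print(f"Forecast loss index is {ratio:.0%} of the observed-gust loss index")
```

Under these assumed numbers, a 10% low bias in the peak gust reduces the loss index by roughly 60%, which is why a modest underestimate of peak winds can matter so much for impact estimates.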