HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting

Nian Ran, Peng Xiao, Yue Wang, Wesley Shi, Jianxin Lin, Qi Meng, Richard Allmendinger
2024-09-28
Abstract: The application of large deep learning models in weather forecasting has led to significant advancements in the field, including higher-resolution forecasting and extended prediction periods exemplified by models such as Pangu and Fuxi. Despite these successes, previous research has largely neglected extreme weather events, and the availability of datasets specifically curated for such events remains limited. Given the critical importance of accurately forecasting extreme weather, this study introduces a comprehensive dataset that incorporates high-resolution extreme weather cases derived from the High-Resolution Rapid Refresh (HRRR) data, a 3-km real-time dataset provided by NOAA. We also evaluate the current state-of-the-art deep learning models and Numerical Weather Prediction (NWP) systems on HR-Extreme, and provide an improved baseline deep learning model called HR-Heim, which has superior performance on both general loss and HR-Extreme compared to others. Our results reveal that the errors of extreme weather cases are significantly larger than the overall forecast error, highlighting them as a crucial source of loss in weather prediction. These findings underscore the necessity for future research to focus on improving the accuracy of extreme weather forecasts to enhance their practical utility.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the accuracy of extreme weather forecasting. Although deep learning models such as Pangu and Fuxi have advanced weather forecasting considerably in recent years, they still exhibit considerable errors when predicting extreme weather events (such as hurricanes, tornadoes, and severe storms). Existing datasets either lack sufficient extreme weather cases or are not in a format suitable for deep learning models. The study therefore proposes a high-resolution dataset named HR-Extreme, built on the 3-km High-Resolution Rapid Refresh (HRRR) data provided by NOAA. The dataset covers various types of extreme weather (such as strong winds, heavy rain, hail, tornadoes, and extreme temperatures). The authors evaluate existing state-of-the-art deep learning models and Numerical Weather Prediction (NWP) systems on it and provide an improved baseline model, HR-Heim. Experimental results show that all models have significantly higher errors on extreme weather events than on general weather forecasts, highlighting the urgency of improving extreme weather forecast accuracy. The paper also discusses the dataset construction process and its limitations, in particular the uncertainty in identifying the spatial extent of extreme weather from user reports.
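The comparison between overall error and extreme-case error can be illustrated with a small sketch. It assumes that an extreme event is marked by a rectangular box of grid cells and that error is measured as RMSE; the function names `rmse` and `box_mask`, the toy 4x4 grid, and the box coordinates are all hypothetical and are not taken from the paper or from any HR-Extreme tooling.

```python
import math


def rmse(pred, truth, mask=None):
    """RMSE over a 2-D grid, optionally restricted to cells where mask is True."""
    sq_sum, n = 0.0, 0
    for i, (p_row, t_row) in enumerate(zip(pred, truth)):
        for j, (p, t) in enumerate(zip(p_row, t_row)):
            if mask is None or mask[i][j]:
                sq_sum += (p - t) ** 2
                n += 1
    if n == 0:
        raise ValueError("mask selects no grid cells")
    return math.sqrt(sq_sum / n)


def box_mask(shape, row_range, col_range):
    """Boolean mask that is True inside a half-open rectangular box (assumed event region)."""
    rows, cols = shape
    r0, r1 = row_range
    c0, c1 = col_range
    return [[r0 <= i < r1 and c0 <= j < c1 for j in range(cols)]
            for i in range(rows)]


# Toy data (made up): the forecast is off by 1 everywhere,
# except for a 2x2 "extreme event" block where it is off by 5.
truth = [[0.0] * 4 for _ in range(4)]
pred = [[1.0] * 4 for _ in range(4)]
for i in range(2):
    for j in range(2):
        pred[i][j] = 5.0

event = box_mask((4, 4), (0, 2), (0, 2))
overall_rmse = rmse(pred, truth)
extreme_rmse = rmse(pred, truth, mask=event)
print(f"overall RMSE: {overall_rmse:.3f}, extreme-region RMSE: {extreme_rmse:.3f}")
```

In this toy case the error inside the event box is larger than the grid-wide error, which averages the large local error together with many well-predicted cells. That is the kind of gap a domain-wide loss can hide.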