Prediction of heat waves using meteorological variables in diverse regions of Iran with advanced machine learning models

Seyed Babak Haji Seyed Asadollah, Najeebullah Khan, Ahmad Sharafati, Shamsuddin Shahid, Eun-Sung Chung, Xiao-Jun Wang
DOI: https://doi.org/10.1007/s00477-021-02103-z
IF: 3.821
2021-10-04
Stochastic Environmental Research and Risk Assessment
Abstract: Climate change has caused a rise in temperature extremes, particularly heatwaves, in recent decades. This study develops physical-empirical models for forecasting the annual number of heatwave days (HWDs) in Iran from synoptic predictors, using two classical machine learning algorithms, namely decision tree (DT) and random forest, and a novel hybrid technique that combines AdaBoost regression with a decision tree (ABR-DT). Daily temperature data from the Princeton Meteorological Forcing dataset were extracted for 27 points to estimate the annual number of HWDs, and the National Centers for Environmental Prediction reanalysis data were used as predictors. The major synoptic variables were extracted at four pressure levels (300, 500, 850, and 1000 hPa) and three monthly time lags. Principal component analysis was employed to reduce the diverse predictors and their features to an optimal structure. The grid point-based performance evaluation revealed the superiority of ABR-DT, which achieved a correlation coefficient (CC) of 0.860 and a mean absolute error (MAE) of 6.929 using only specific humidity and wind component as predictors. The spatial performance indices over eight different climate regions of Iran also showed the better performance of ABR-DT, which improved the CC and MAE of its two alternatives by up to 185% and 19%, respectively. The study identified the optimal combination of predictors of heatwaves by examining the effects of numerous weather components. The results demonstrate the effectiveness of the proposed hybrid forecasting approach in predicting heatwave days, a devastating hazard, for many regions.
environmental sciences; engineering, environmental; water resources; civil; statistics & probability
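The abstract reports model skill as a correlation coefficient (CC) and a mean absolute error (MAE). The following is a minimal sketch of how these two scores could be computed, assuming CC means the Pearson correlation between observed and predicted annual HWD counts (the abstract does not state the formula). The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def correlation_coefficient(observed, predicted):
    """Pearson correlation between two equal-length sequences (assumed definition of CC)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical annual heatwave-day counts for one grid point (illustrative only).
observed_hwd = [12, 20, 15, 30, 25]
predicted_hwd = [10, 22, 18, 27, 24]

cc = correlation_coefficient(observed_hwd, predicted_hwd)
mae = mean_absolute_error(observed_hwd, predicted_hwd)
print(f"CC = {cc:.3f}, MAE = {mae:.3f}")
```

A higher CC and a lower MAE both indicate better agreement, which is why the abstract quotes the two scores together when comparing ABR-DT with its alternatives.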