Spatiotemporal estimation of 6-hour high-resolution precipitation across China based on Himawari-8 using a stacking ensemble machine learning model

Siqin Zhou, Yuan Wang, Qiangqiang Yuan, Linwei Yue, Liangpei Zhang
DOI: https://doi.org/10.1016/j.jhydrol.2022.127718
IF: 6.4
2022-01-01
Journal of Hydrology
Abstract: Precipitation plays a significant role in the hydrological cycle and atmospheric circulation. However, estimating precipitation is challenging because of its high spatiotemporal variability. Advances in machine learning offer a new approach to precipitation estimation. In this study, a stacking ensemble machine learning model was developed to estimate 6-hour precipitation at a high spatial resolution of 5 km over China, based on Himawari-8 and ground station data. Meteorological and topographic factors are used as ancillary data. The stacking model consists of two levels. In level 1, three sub-models (extremely randomized trees, extreme gradient boosting, and deep neural networks) are trained separately. Level 2 linearly combines the sub-model outputs to produce the final estimate. Evaluation results over China indicate that the stacking model outperforms each individual sub-model, with a correlation coefficient of 0.630, a mean absolute error of 1.431 mm/6 hr, and a root mean square error of 4.248 mm/6 hr. The proposed model also performs better than several widely used precipitation products (IMERG, GSMaP, and ERA5). In particular, its detection of heavy precipitation (>= 16.9 mm/6 hr), with a probability of detection of 0.97, is distinctly superior to that of the other products. The precipitation estimates were accumulated and mapped at different temporal scales for comparison with these products. The spatial patterns are broadly consistent at annual and daily scales. More importantly, the spatial patterns of the proposed model are more reliable than those of the products because of its smaller mean bias. The study provides a novel alternative for producing high-resolution precipitation datasets with improved accuracy.
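The evaluation metrics named in the abstract (correlation coefficient, mean absolute error, root mean square error, and probability of detection for heavy precipitation) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample values are hypothetical, and the probability of detection is assumed to follow the common definition hits / (hits + misses), where an event is an observation at or above the threshold.

```python
import math

# Heavy-precipitation threshold quoted in the abstract (mm per 6 hr).
HEAVY_THRESHOLD_MM = 16.9


def correlation_coefficient(obs, est):
    """Pearson correlation between observed and estimated precipitation."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def mean_absolute_error(obs, est):
    """Average absolute difference, in the same units as the inputs."""
    return sum(abs(e - o) for o, e in zip(obs, est)) / len(obs)


def root_mean_square_error(obs, est):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def probability_of_detection(obs, est, threshold=HEAVY_THRESHOLD_MM):
    """Assumed definition: hits / (hits + misses) for events >= threshold."""
    hits = sum(1 for o, e in zip(obs, est) if o >= threshold and e >= threshold)
    misses = sum(1 for o, e in zip(obs, est) if o >= threshold and e < threshold)
    if hits + misses == 0:
        return float("nan")  # no observed events, so POD is undefined
    return hits / (hits + misses)


# Hypothetical 6-hour totals (mm) at four stations, for illustration only.
observed = [0.0, 2.0, 20.0, 18.0]
estimated = [1.0, 2.0, 17.0, 15.0]

cc = correlation_coefficient(observed, estimated)
mae = mean_absolute_error(observed, estimated)
rmse = root_mean_square_error(observed, estimated)
pod = probability_of_detection(observed, estimated)
```

In this toy sample, two observations exceed the 16.9 mm threshold and only one of the matching estimates does, so the probability of detection is 0.5, while the mean absolute error is 1.75 mm/6 hr.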