Potential of kernel and tree-based machine-learning models for estimating missing data of rainfall

Mohammad Taghi Sattari, Kambiz Falsafian, Ahmet Irvem, Shahab S, Sultan Noman Qasem
DOI: https://doi.org/10.1080/19942060.2020.1803971
IF: 6.519
2020-01-01
Engineering Applications of Computational Fluid Mechanics
Abstract: This study compared two kernel-based models, Support Vector Regression (SVR) and Gaussian Process Regression (GPR), with two tree-based models, M5 and Random Forest (RF), for estimating missing monthly precipitation data at the Antakya, Dortyol, Iskenderun and Samandag stations, which are important precipitation stations in the Eastern Mediterranean region of Turkey. First, 10% of the precipitation data for the period 1980-2019 were selected at random and treated as missing. Second, the missing data at each station were estimated from the data of the other stations under four data-combination scenarios. For the kernel-based SVR and GPR methods, the RBF kernel gave suitable results for the selected study area. Although the SVR and RF methods gave very close estimates, SVR performed somewhat better than the other methods, especially in terms of error minimization. The GPR model, which is based on Gaussian functions, generally tended to estimate missing values close to the mean; this is its main disadvantage, and it was therefore unsuccessful in the estimation process. Overall, the results showed that machine-learning algorithms can successfully estimate missing precipitation data.
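The evaluation protocol described in the abstract (hide 10% of a station's months at random, estimate them from the other stations, score the estimates) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the station names "A" to "D" and the three input combinations are hypothetical, and a simple least-squares line stands in for the SVR, GPR, M5 and RF models used in the paper.

```python
import math
import random

def mask_random(n, fraction, rng):
    """Return sorted indices of the months treated as missing."""
    k = round(n * fraction)
    return sorted(rng.sample(range(n), k))

def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x (stand-in estimator, not the paper's models)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b

def rmse(obs, est):
    """Root-mean-square error between observed and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))

def evaluate(target, neighbours, missing):
    """Fit on the observed months, estimate the masked months, return the RMSE."""
    hidden = set(missing)
    # Predictor: mean of the chosen neighbour stations for each month.
    x = [sum(col) / len(col) for col in zip(*neighbours)]
    train = [i for i in range(len(target)) if i not in hidden]
    a, b = fit_line([x[i] for i in train], [target[i] for i in train])
    est = [max(0.0, a + b * x[i]) for i in missing]  # rainfall cannot be negative
    return rmse([target[i] for i in missing], est)

# Synthetic monthly totals (mm) for 40 years; NOT the paper's data.
rng = random.Random(0)
n_months = 40 * 12
regional = [max(0.0, 60 + 50 * math.cos(2 * math.pi * m / 12) + rng.gauss(0, 20))
            for m in range(n_months)]
stations = {
    name: [max(0.0, scale * r + rng.gauss(0, 10)) for r in regional]
    for name, scale in [("A", 1.0), ("B", 1.2), ("C", 0.8), ("D", 1.1)]
}

missing = mask_random(n_months, 0.10, rng)
# Hypothetical input combinations for estimating station "A".
scenarios = {"B": ["B"], "B+C": ["B", "C"], "B+C+D": ["B", "C", "D"]}
scores = {
    label: evaluate(stations["A"], [stations[s] for s in names], missing)
    for label, names in scenarios.items()
}
for label, score in scores.items():
    print(f"{label}: RMSE = {score:.1f} mm")
```

Any regressor could replace `fit_line` inside `evaluate`. Keeping the masked indices fixed across scenarios means that differences in RMSE reflect the input combination and not the random draw.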
engineering, mechanical,mechanics, multidisciplinary