Cloud Segmentation, Validation of Weather Data, and Precipitation Prediction Using Machine Learning Algorithms
Nagaraj Rajendiran, Sruthy Sebastian, Lakshmi Sutha Kumar
DOI: https://doi.org/10.1007/s13369-023-08611-0
IF: 2.807
2024-01-20
Arabian Journal for Science and Engineering
Abstract: Precipitation plays a vital role in the hydrological cycle, crop irrigation, drinking water, and ecosystem maintenance. Excessive precipitation leads to flooding, landslides, and loss of human lives and property, while inadequate precipitation leads to drought. Precipitation prediction is therefore necessary for planning and protecting water resources and for handling natural disasters such as floods and droughts. This paper proposes cloud segmentation and precipitation prediction using Machine Learning (ML) algorithms. An ML cloud segmentation model is implemented to segment INSAT-3D Thermal Infra-Red (TIR) images into No-cloud, Low-Level (L-L), Mid-Level (M-L), and High-Level (H-L) cloud classes and to determine the percentage of each cloud type over the study area. The ML algorithms considered for cloud segmentation are Fuzzy C-Means (FCM) clustering, Gaussian Naïve Bayes (GNB), K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). Precipitation intensity is predicted from the weather dataset using three ML regression models: KNN, DT, and RF. The estimated cloud percentage is validated against the cloud cover attributes of the weather dataset. Qualitative and quantitative results of cloud segmentation show that XGBoost, LightGBM, and CatBoost are the best-performing models. Quantitative results of precipitation prediction show that RF is the best-performing and KNN the least-performing ML regression model. The performance of the ML regression models is further improved by including dimensionality reduction techniques such as Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), and Independent Component Analysis (ICA).
multidisciplinary sciences
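
The abstract describes two computational steps: deriving per-class cloud percentages from a segmented image, and predicting precipitation intensity with regression models preceded by dimensionality reduction. Below is a minimal sketch of both, not the authors' implementation. It assumes scikit-learn and NumPy. The label order (0 = No-cloud, 1 = L-L, 2 = M-L, 3 = H-L), the function names `cloud_percentages` and `compare_regressors`, the hyperparameters, and the synthetic data are all hypothetical stand-ins for the INSAT-3D labels and the weather dataset, neither of which is given in this record.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed label order for the four classes named in the abstract.
CLASS_NAMES = ["No-cloud", "L-L", "M-L", "H-L"]


def cloud_percentages(label_map):
    """Return the percentage of pixels assigned to each class (labels 0..3)."""
    labels = np.asarray(label_map).ravel()
    counts = np.bincount(labels, minlength=len(CLASS_NAMES))
    return {name: float(100.0 * c / counts.sum())
            for name, c in zip(CLASS_NAMES, counts)}


def compare_regressors(X, y, n_components=3, seed=0):
    """Fit KNN and RF regressors behind a scale + PCA step; return test RMSE."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    models = {
        "KNN": KNeighborsRegressor(n_neighbors=5),
        "RF": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    rmse = {}
    for name, model in models.items():
        # PCA is one of the reduction techniques the abstract lists; the
        # number of components here is an arbitrary illustrative choice.
        pipe = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), model)
        pipe.fit(X_tr, y_tr)
        pred = pipe.predict(X_te)
        rmse[name] = float(np.sqrt(mean_squared_error(y_te, pred)))
    return rmse


# Synthetic stand-ins: a random 64x64 label map and a toy "weather" table.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=(64, 64))
percentages = cloud_percentages(labels)

X = rng.normal(size=(300, 6))
y = np.maximum(0.0, 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=300))
scores = compare_regressors(X, y)
```

With real data, `labels` would come from whichever segmentation model is being evaluated, and `X`, `y` from the weather attributes and observed precipitation intensity. The RMSE values on this synthetic data say nothing about the rankings reported in the paper.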