Cloud classification through machine learning and global horizontal irradiance data analysis

Anabela Rocío Lusi, Pablo Facundo Orte, Elian Wolfram, José Ignacio Orlando
DOI: https://doi.org/10.1002/qj.4880
2024-10-26
Quarterly Journal of the Royal Meteorological Society
Abstract: This study presents a cost‐effective machine‐learning model for automated cloud‐type classification using ground‐based global horizontal irradiance (GHI) measurements and a clear‐sky model. Cloud‐type labels were assigned by meteorological observers from all‐sky images. Eight features were extracted from the GHI "signature" within a fixed time window around the datetime of each all‐sky image. The XGBoost classifier on 33‐min windows was the most effective, achieving an accuracy of 0.88 and a Cohen's kappa of 0.84.
Observing and characterizing clouds is crucial owing to their influence on energy balance, climate, and weather. Their effects on radiation vary with cloud parameters such as cloud base or top height, water content, and cloud optical thickness, all of which are closely related to the specific cloud type. Cloud classification is therefore a crucial task in meteorology, although it remains challenging for weather services worldwide owing to the intensive labor and cost involved. In this study we introduce a new low‐cost method for automating cloud classification based on a combination of ground‐based global horizontal irradiance (GHI) measurements, a clear‐sky model, and machine learning. Based on the hypothesis that different cloud types have their own GHI signatures, we trained different supervised learning algorithms using GHI data manually labeled by meteorological observers from time‐synchronized all‐sky images. Multiple time windows were extracted from each GHI series, with eight features defined in each case to characterize the sequence. The best outcome was achieved using an XGBoost model on features extracted from 33‐min time windows, obtaining an accuracy of 0.88 and a Cohen's kappa of 0.84 on a held‐out test set.
The method presented in this study can provide low‐cost cloud classification from ground‐based observations, a task that remains challenging for weather services worldwide owing to its intensive labor and cost.
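The windowed feature extraction and the agreement metric described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the eight features, so the five statistics below are hypothetical stand-ins, and the 1-min sampling interval, the function names, and the use of a clear-sky index (measured GHI divided by modeled clear-sky GHI) are all assumptions. The classifier itself is omitted; in practice the feature dictionaries would be passed to a gradient-boosting model such as XGBoost.

```python
from statistics import mean, pstdev


def clear_sky_index(ghi, ghi_clear):
    """Ratio of measured GHI to modeled clear-sky GHI, sample by sample."""
    return [g / c if c > 0 else 0.0 for g, c in zip(ghi, ghi_clear)]


def window_features(ghi, ghi_clear, center, half_width):
    """Illustrative statistics over a window centered on an image timestamp.

    `center` and `half_width` are sample indices. Assuming 1-min data,
    half_width=16 yields a 33-sample (33-min) window. The feature set is
    hypothetical; the paper's eight features are not given in the abstract.
    """
    lo, hi = center - half_width, center + half_width + 1
    if lo < 0 or hi > len(ghi):
        raise ValueError("window extends past the series")
    k = clear_sky_index(ghi[lo:hi], ghi_clear[lo:hi])
    # Absolute change between consecutive samples, a simple variability proxy.
    steps = [abs(b - a) for a, b in zip(k, k[1:])]
    return {
        "k_mean": mean(k),
        "k_std": pstdev(k),
        "k_min": min(k),
        "k_max": max(k),
        "mean_abs_step": mean(steps),
    }


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement and p_e the agreement expected by chance
    from the marginal label frequencies.
    """
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    if p_e == 1.0:
        # Only one label present in both lists: agreement is trivially perfect.
        return 1.0
    return (p_o - p_e) / (1 - p_e)
```

Kappa is reported alongside accuracy because it discounts the agreement that would occur by chance when some cloud types are much more frequent than others.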
meteorology & atmospheric sciences