Enhancing environmental data imputation: A physically-constrained machine learning framework

Marcos Pastorini, Rafael Rodríguez, Lorena Etcheverry, Alberto Castro, Angela Gorgoglione
DOI: https://doi.org/10.1016/j.scitotenv.2024.171773
IF: 9.8
2024-03-30
The Science of The Total Environment
Abstract: In water resources management, new computational capabilities have made it possible to develop integrated models that jointly analyze climatic conditions and the water quantity and quality of an entire watershed system. Although the value of this integrated approach has been demonstrated, the limited availability of field data may hinder its applicability by causing high uncertainty in the model response. In this context, before collecting additional data, it is recommended to first assess how much model performance would improve if all available records were fully exploited. This work proposes a novel machine learning framework with physical constraints capable of successfully imputing a high percentage of missing data belonging to several environmental domains (meteorology, water quantity, water quality), yielding satisfactory results. In particular, the minimum NSE computed for meteorological variables is 0.72. For hydrometric variables, NSE is always > 0.97. More than 78% of the physical water-quality variables are characterized by NSE > 0.45, and more than 66% of the chemical water-quality variables reach NSE > 0.35. These results demonstrate the effectiveness of the proposed framework as a data augmentation tool to improve the performance of integrated environmental modeling.
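The abstract reports its results as NSE values without expanding the acronym. Assuming it refers to the Nash-Sutcliffe efficiency commonly used in hydrology, NSE = 1 - Σ(obs - est)² / Σ(obs - mean(obs))², the following minimal sketch shows how such a score could be computed for imputed values against held-out observations. The function name and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              imputed: Sequence[float]) -> float:
    """Return 1 - (sum of squared errors) / (sum of squared deviations
    of the observations from their own mean)."""
    if len(observed) != len(imputed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, imputed))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - sse / spread


# Made-up values for illustration only (not data from the paper).
held_out = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = list(held_out)   # imputation that reproduces every observation
mean_only = [3.0] * 5      # imputation that always predicts the mean

nse_perfect = nash_sutcliffe_efficiency(held_out, perfect)   # 1.0
nse_mean = nash_sutcliffe_efficiency(held_out, mean_only)    # 0.0
```

Under this definition, a score of 1 means the imputed values match the observations exactly, 0 means they are no better than predicting the observed mean, and negative values mean they are worse than that baseline. This is the scale on which thresholds such as 0.35, 0.45, 0.72, and 0.97 would be read.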
environmental sciences