Abstract: The ecosystems of semi-arid watersheds are shaped by the combined effects of natural climate factors, rainfall, and habitat destruction, which produce complex mechanisms of spatial differentiation and evolution in water ecological health. In mainstream water ecological health assessment methods, such as the Index of Biotic Integrity (IBI), indicator selection often relies on subjective choices of reference points. This approach tends to overlook the combined impacts of, and interactions among, environmental stressors. For watersheds strongly influenced by natural climatic factors, it introduces considerable uncertainty, leaving water ecological health protection goals without a sound scientific basis. In this study, the capacity of the random forest (RF) model to capture nonlinear relationships was used to reduce subjectivity in traditional water ecological health assessments and to reveal more accurately the spatial differentiation patterns and underlying causes of water ecological health in the Wei River Basin (WRB), the largest typical semi-arid watershed of the Yellow River in China. Our findings are as follows: (1) Traditional evaluation indices classify the overall water ecological health of the WRB as sub-healthy (60%). The core indicators include dominant species, total algal density, and the percentage of diatom density, with no significant spatial differentiation observed. (2) An improved water ecological health assessment method for semi-arid watersheds, based on the RF model, was developed to replace the subjective judgment steps of traditional methods. This method establishes a complex multi-input–output response relationship (R² > 0.85) between environmental stress indicators and the biological integrity index for the WRB. (3) The model results identify key driving factors affecting changes in water ecological health in semi-arid watersheds, with the sensitivity of the new model increasing nearly 11-fold compared to traditional IBI methods. (4) With the improved method, the water ecological health characteristics of the WRB exhibit significant spatial heterogeneity, with a higher dispersion coefficient (1.21), and show stronger nonlinear response trends to climatic factors. The application of machine learning models indicates that traditional methods may underestimate the extent of ecological health degradation in watersheds and tend to oversimplify spatial heterogeneity.
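
The abstract reports two summary statistics, a goodness of fit (R² > 0.85) and a dispersion coefficient (1.21), without defining them. The sketch below shows one common way such values could be computed. It assumes that R² is the standard coefficient of determination and that the dispersion coefficient is the coefficient of variation (standard deviation divided by mean); neither definition is confirmed by the source. The function names and the site scores are hypothetical and are not data from the study.

```python
from statistics import mean, pstdev


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared gaps between observations and their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def dispersion_coefficient(scores):
    """Assumed definition: population standard deviation divided by the mean."""
    score_mean = mean(scores)
    if score_mean == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return pstdev(scores) / score_mean


# Hypothetical site-level health scores (illustrative only, not study data).
observed_ibi = [2.0, 4.0, 6.0, 8.0]
predicted_ibi = [2.5, 3.5, 6.5, 7.5]

fit = r_squared(observed_ibi, predicted_ibi)
spread = dispersion_coefficient(observed_ibi)
print(f"R^2 = {fit:.2f}, dispersion coefficient = {spread:.3f}")
```

Under these assumed definitions, a dispersion coefficient above 1 means the standard deviation of site scores exceeds their mean, which fits the abstract's description of strong spatial heterogeneity.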

Spatial Patterns of Hydroecological Health in the Semi-Arid Yellow River Basin: Revelations from Machine Learning Models