A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe River Basin, northwestern China

Chong Chen, Wei He, Han Zhou, Yaru Xue, Mingda Zhu
DOI: https://doi.org/10.1038/s41598-020-60698-9
IF: 4.6
2020-03-03
Scientific Reports
Abstract: Groundwater is a unique resource for agriculture, domestic use, industry and environment in the Heihe River Basin, northwestern China. Numerical models are effective approaches to simulate and analyze the groundwater dynamics under changeable conditions and have been widely used all over the world. In this paper, the groundwater dynamics of the middle reaches of the Heihe River Basin were simulated using one numerical model and three machine learning algorithms (multi-layer perceptron (MLP); radial basis function network (RBF); support vector machine (SVM)). Historical groundwater levels and streamflow rates were used to calibrate/train and verify the different methods. The root mean square error and R² were used to evaluate the accuracy of the simulation/training and verification results. The results showed that the accuracy of the machine learning models was significantly better than that of the numerical model in both stages. The SVM and RBF performed the best in the training and verification stages, respectively. However, it should be noted that the generalization ability of the numerical model is superior to that of the machine learning models because of the inclusion of physical mechanisms. This study provides a feasible and accurate approach for simulating groundwater dynamics and a reference for model selection.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper simulates groundwater dynamics in the middle reaches of the Heihe River Basin in northwestern China using a numerical model and three machine-learning algorithms: Multi-Layer Perceptron (MLP), Radial Basis Function Network (RBF), and Support Vector Machine (SVM). Specifically, the paper aims to:

1. **Explore the effectiveness of machine-learning methods in simulating groundwater dynamics in arid basins**: compare machine-learning methods with a traditional numerical model and evaluate how well each simulates groundwater dynamics.
2. **Discuss the applicability of machine-learning methods and numerical models**: compare the simulation results of the different methods, analyze their advantages and disadvantages, and provide a reference for model selection.

### Background of the Paper

Groundwater is an important resource for agriculture, households, industry, and the environment in the Heihe River Basin in northwestern China. Numerical models are effective tools for simulating and analyzing groundwater dynamics, but they require large amounts of accurate data, and available computational resources often struggle to meet the demands of increasingly complex models. In recent years, machine-learning methods have been widely applied in hydrological research because of their ability to handle complex data patterns.

### Research Methods

- **Numerical Model**: MODFLOW, which solves the three-dimensional groundwater flow equation using the finite-difference method.
- **Machine-Learning Models**:
  - **Multi-Layer Perceptron (MLP)**: a feed-forward artificial neural network consisting of an input layer, a hidden layer, and an output layer.
  - **Radial Basis Function Network (RBF)**: a network that uses a radial basis function as the activation function of the hidden layer.
  - **Support Vector Machine (SVM)**: a method based on statistical learning theory that uses a nonlinear kernel function to map the input to a high-dimensional feature space.

### Data and Processing

- **Data Sources**: a Digital Elevation Model (DEM), land-use data, groundwater pumping rates, groundwater levels, river flow, etc.
- **Data Processing**: time-series data are converted into monthly stress periods from January 1986 to December 2010. The data are divided into a calibration/training period (1986-2008) and a validation period (2009-2010).

### Model Performance Evaluation

- **Evaluation Metrics**: Root Mean Square Error (RMSE) and the Coefficient of Determination (R²) are used to evaluate simulation accuracy.
- **Generalization Ability**: the generalization ability of each model is evaluated by comparing its RMSE in the prediction stage with its RMSE in the training stage.

### Results and Discussion

- **Calibration/Training Stage**: The numerical model performs reasonably during the calibration period, with an RMSE of 5.61 meters and an R² of 0.52. The RMSE values of the machine-learning models (MLP, RBF, SVM) are 0.99 meters, 0.84 meters, and 0.83 meters respectively, and their R² values are 0.71, 0.75, and 0.76 respectively. The SVM model performs slightly better than the other two machine-learning models.
- **Validation Stage**: The numerical model and the machine-learning models are also evaluated over the validation period. The results show that the machine-learning models perform better than the numerical model in the prediction stage.

### Conclusions

- **Machine-Learning Models**: they show higher accuracy in simulating groundwater dynamics in both the training and validation stages.
- **Numerical Models**: although they are less accurate than the machine-learning models in the prediction stage, they have an advantage in generalization ability because they incorporate physical mechanisms.

In general, this study provides feasible and accurate methods for simulating groundwater dynamics in arid areas and provides references for model selection.
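
The evaluation setup summarized above (monthly stress periods split into calibration and validation years, scored with RMSE and R²) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the sample head values are hypothetical, and R² is assumed here to be the coefficient of determination, 1 - SS_res / SS_tot; the paper may instead use the squared correlation coefficient.

```python
import math


def stress_periods(start_year=1986, end_year=2010):
    """Return one (year, month) tuple per monthly stress period."""
    return [(y, m) for y in range(start_year, end_year + 1) for m in range(1, 13)]


def split_periods(periods, last_calibration_year=2008):
    """Split periods into calibration/training and validation sets by year."""
    calibration = [p for p in periods if p[0] <= last_calibration_year]
    validation = [p for p in periods if p[0] > last_calibration_year]
    return calibration, validation


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, simulated):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


periods = stress_periods()
calibration, validation = split_periods(periods)
print(len(periods), len(calibration), len(validation))

# Hypothetical groundwater levels in meters, made up for illustration only.
obs = [1480.0, 1481.0, 1482.0, 1483.0]
sim = [1480.5, 1480.5, 1482.5, 1482.5]
print(round(rmse(obs, sim), 3), round(r_squared(obs, sim), 3))
```

Computing the same two metrics separately on the calibration and validation periods, then comparing the RMSE values, mirrors the way the summary describes assessing generalization ability.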