Deep autoregressive neural networks for high-dimensional inverse problems in groundwater contaminant source identification

Shaoxing Mo, Nicholas Zabaras, Xiaoqing Shi, Jichun Wu
DOI: https://doi.org/10.1029/2018WR024638
2018-12-22
Abstract: Identification of a groundwater contaminant source simultaneously with the hydraulic conductivity in highly-heterogeneous media often results in a high-dimensional inverse problem. In this study, a deep autoregressive neural network-based surrogate method is developed for the forward model to allow us to solve efficiently such high-dimensional inverse problems. The surrogate is trained using limited evaluations of the forward model. Since the relationship between the time-varying inputs and outputs of the forward transport model is complex, we propose an autoregressive strategy, which treats the output at the previous time step as input to the network for predicting the output at the current time step. We employ a dense convolutional encoder-decoder network architecture in which the high-dimensional input and output fields of the model are treated as images to leverage the robust capability of convolutional networks in image-like data processing. An iterative local updating ensemble smoother (ILUES) algorithm is used as the inversion framework. The proposed method is evaluated using a synthetic contaminant source identification problem with 686 uncertain input parameters. Results indicate that, with relatively limited training data, the deep autoregressive neural network consisting of 27 convolutional layers is capable of providing an accurate approximation for the high-dimensional model input-output relationship. The autoregressive strategy substantially improves the network's accuracy and computational efficiency. The application of the surrogate-based ILUES in solving the inverse problem shows that it can achieve accurate inversion results and predictive uncertainty estimates.
Machine Learning
What problem does this paper attempt to address?
The key problem this paper addresses is the simultaneous identification of a groundwater contaminant source (its location and release history) and estimation of the hydraulic conductivity field in highly heterogeneous media. This typically leads to a high-dimensional inverse problem that is computationally expensive and difficult to solve directly.

### Specific description of the problem

1. **High-dimensional inverse problems**:
   - Identifying the contaminant source and estimating the hydraulic conductivity field are key steps in predicting groundwater contamination and in making remediation decisions.
   - Because direct measurements are limited, these parameters usually have to be inferred indirectly by solving an inverse problem.
   - In highly heterogeneous aquifers, representing the hydraulic conductivity field often requires a large number of random degrees of freedom, which makes the inverse problem high-dimensional.
2. **Time-dependent input**:
   - The release history of the contaminant source is usually time-dependent, which complicates the input-output relationship of the forward model.
   - The surrogate therefore needs a way to capture this time dependence.
3. **Computational efficiency**:
   - Solving a high-dimensional inverse problem usually requires a large number of forward model evaluations, so the computational cost is very high.
   - An efficient surrogate model is needed to reduce this computational burden.

### Solutions proposed in the paper

To address these problems, the paper proposes a surrogate method based on a deep autoregressive neural network, combined with the iterative local updating ensemble smoother (ILUES) algorithm, to solve high-dimensional inverse problems efficiently.

Specifically:

- **Deep autoregressive neural network**:
  - Under the autoregressive strategy, the output at the previous time step is used as an input to the network when predicting the output at the current time step, which improves prediction accuracy for time-dependent problems.
  - A dense convolutional encoder-decoder architecture treats the high-dimensional input and output fields as images, turning the surrogate task into an image-to-image regression problem that exploits the strengths of convolutional networks on image-like data.
- **ILUES algorithm**:
  - ILUES serves as the inversion framework for the high-dimensional nonlinear inverse problem.
  - Replacing the forward model with the surrogate inside ILUES reduces the cost of the many model evaluations that the inversion requires.

### Results and verification

The method is evaluated on a synthetic contaminant source identification problem with 686 uncertain input parameters. The results show that:

- With relatively limited training data, the deep autoregressive neural network, which consists of 27 convolutional layers, provides an accurate approximation of the high-dimensional model input-output relationship.
- The autoregressive strategy substantially improves the accuracy and computational efficiency of the network.
- The surrogate-based ILUES achieves accurate inversion results and predictive uncertainty estimates.

In summary, the paper tackles the high-dimensional inverse problem of groundwater contaminant source identification by combining a deep learning surrogate with an ensemble-based data assimilation method, improving computational efficiency while maintaining accurate inversion results.
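
The autoregressive strategy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `rollout`, `toy_surrogate`, the 4x4 grid, and the exact input layout (a conductivity field, a per-step source strength, and the previous concentration field) are assumptions made for illustration, and the toy function only stands in for the trained encoder-decoder network.

```python
import numpy as np


def rollout(surrogate, log_k, source_terms, c0):
    """Autoregressive rollout: each predicted field is fed back as an
    input when predicting the next time step.

    surrogate    -- callable (log_k, s_t, c_prev) -> c_next (hypothetical API)
    log_k        -- 2-D log-conductivity field, treated as an image
    source_terms -- sequence of source strengths, one per time step
    c0           -- initial concentration field, same shape as log_k
    """
    fields = []
    c_prev = c0
    for s_t in source_terms:
        c_prev = surrogate(log_k, s_t, c_prev)  # output becomes next input
        fields.append(c_prev)
    return np.stack(fields)  # shape: (n_steps, ny, nx)


def toy_surrogate(log_k, s_t, c_prev):
    """Hypothetical stand-in for the trained network: halve the previous
    field and inject the current source strength at one fixed cell."""
    c_next = 0.5 * c_prev  # new array, so c_prev is not modified
    c_next[0, 0] += s_t
    return c_next


log_k = np.zeros((4, 4))           # assumed toy conductivity field
source_terms = [1.0, 2.0, 0.0]     # assumed release history
c0 = np.zeros((4, 4))              # clean aquifer at the start

fields = rollout(toy_surrogate, log_k, source_terms, c0)
print(fields.shape)
print(fields[:, 0, 0])
```

Because the network only ever maps one time step to the next, the same trained model can be reused for every step of the release history; in a surrogate-based inversion, a rollout like this would replace each call to the forward transport model.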