Causal prior-embedded physics-informed neural networks and a case study on metformin transport in porous media

Qiao Kang, Baiyu Zhang, Yiqi Cao, Xing Song, Xudong Ye, Xixi Li, Hongjing Wu, Yuanzhu Chen, Bing Chen
DOI: https://doi.org/10.1016/j.watres.2024.121985
2024-09-01
Abstract: This study introduces a novel approach to transport modelling by integrating experimentally derived causal priors into neural networks. We illustrate this paradigm using a case study of metformin, a ubiquitous pharmaceutical emerging pollutant, and its transport behaviour in sandy media. Specifically, data from metformin's sandy column transport experiment were used to estimate unobservable parameters through the physics-based model Hydrus-1D, followed by data augmentation to produce a more comprehensive dataset. A causal graph incorporating key variables was constructed, aiding in identifying impactful variables and estimating their causal dynamics, or "causal prior." The causal priors extracted from the augmented dataset included underexplored system parameters such as the type-1 sorption fraction F, the first-order reaction rate coefficient α, and the transport system scale. Their moderate impact on the transport process was quantitatively evaluated (normalized causal effects of 0.0423, -0.1447 and -0.0351, respectively), with adequate confounders considered for the first time. The prior was then embedded into multilayer neural networks via two methods: causal weight initialization and causal prior regularization. Based on the results of AutoML hyperparameter tuning experiments, using the two embedding methods simultaneously emerged as the more advantageous practice, since the proposed causal weight initialization technique can enhance model stability, particularly when used in conjunction with causal prior regularization. Amongst the experiments utilizing both techniques, the R-squared values peaked at 0.881. This study demonstrates a balanced approach between expert knowledge and data-driven methods, providing enhanced interpretability in black-box models such as neural networks for environmental modelling.
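The abstract names the two embedding methods but does not give their formulas, so the following is only an illustrative sketch of how such a scheme might look, not the authors' implementation. It assumes a one-hidden-layer tanh network, a hypothetical initializer that centres each input feature's first-layer weights on its normalized causal effect, and a hypothetical regularizer that penalizes the gap between the network's finite-difference input sensitivities and the causal prior. All function names, the network shape, and the penalty form are assumptions; only the three causal effect values come from the abstract.

```python
import numpy as np

# Normalized causal effects reported in the abstract for F, alpha, and system scale.
CAUSAL_EFFECTS = np.array([0.0423, -0.1447, -0.0351])


def causal_weight_init(effects, n_hidden, noise=0.01, seed=0):
    """Hypothetical scheme: centre each feature's first-layer weights on its causal effect."""
    rng = np.random.default_rng(seed)
    return effects[:, None] + noise * rng.standard_normal((effects.size, n_hidden))


def forward(x, w1, w2):
    """One-hidden-layer tanh network; x is (n_samples, n_features), w2 is (n_hidden,)."""
    return np.tanh(x @ w1) @ w2


def input_sensitivity(x, w1, w2, eps=1e-4):
    """Mean central finite-difference slope of the output with respect to each input."""
    slopes = np.empty(x.shape[1])
    for j in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[j] = eps
        diff = forward(x + step, w1, w2) - forward(x - step, w1, w2)
        slopes[j] = np.mean(diff / (2 * eps))
    return slopes


def causal_prior_penalty(x, w1, w2, effects, lam=1.0):
    """Hypothetical regularizer: squared gap between learned sensitivities and the prior."""
    return lam * np.sum((input_sensitivity(x, w1, w2) - effects) ** 2)


def total_loss(x, y, w1, w2, effects, lam=1.0):
    """Data-fit term (MSE) plus the assumed causal prior penalty."""
    mse = np.mean((forward(x, w1, w2) - y) ** 2)
    return mse + causal_prior_penalty(x, w1, w2, effects, lam)
```

In this sketch the initializer starts the network near the prior, while the penalty keeps it near the prior during training, which is one plausible reading of why the abstract reports the two techniques working best together.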