Comparing machine learning and inverse modeling approaches for the source term estimation

Stefano Alessandrini, Scott Meech, Will Cheng, Christopher Rozoff, Rajesh Kumar
DOI: https://doi.org/10.1007/s11869-024-01570-x
2024-04-26
Air Quality, Atmosphere & Health
Abstract: Mathematical models serve as crucial tools for quantitatively assessing the environmental and population impact resulting from the release of hazardous substances. Often, precise source parameters remain elusive, leading to a reliance on rudimentary assumptions. This challenge is particularly pronounced for accidental releases and for deliberate acts of terrorism. A conventional method for estimating the source term involves constructing backward plumes originating from the sensors that measure tracer concentrations. The area with the highest overlap of these backward plumes typically offers an initial approximation of the most probable release location. In this work, the backward plume (BP) method is compared with a machine-learning-based method. Both methods use data from a field campaign and from a synthetic dataset built from a simple setup featuring receptors arranged linearly downwind from the release point. A substantial number (~1500) of forward plume simulations are conducted, each initiated from a random location and under varying meteorological conditions. The resulting dataset comprises the meteorological variables and the concentrations recorded by idealized receptors, and it is partitioned into training and testing subsets. A feed-forward neural network (NN) is trained using the receptor concentrations and the associated meteorological variables as input, with the source location coordinates as the output. Verification is then carried out on the testing dataset, allowing the NN's and the BP's predictions to be compared with the actual source locations. One of the key advantages of the NN-based approach is its ability to estimate the source term rapidly, typically within a fraction of a second on a standard laptop. This speed is of paramount significance for accidental releases, where a swift response is essential. Notably, the computationally intensive tasks of dataset construction and NN training can be conducted offline, providing preparedness in areas where accidental releases may be anticipated.
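The NN workflow described in the abstract (simulate forward plumes from random sources, split the data, train a feed-forward network that maps receptor readings and meteorology to source coordinates) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy Gaussian plume `plume` stands in for the dispersion model, the five receptors sit on a hypothetical crosswind line, wind speed is the only meteorological input, and the network size, learning rate, and sample count (250 rather than ~1500) are chosen only so the example runs quickly in pure Python.

```python
import math
import random

random.seed(0)

# Hypothetical layout: five receptors on a crosswind line 100 m downwind.
RECEPTORS = [(100.0, y) for y in (-40.0, -20.0, 0.0, 20.0, 40.0)]

def plume(src, rec, wind):
    """Toy Gaussian plume with wind along +x (an assumption, not the paper's model)."""
    dx, dy = rec[0] - src[0], rec[1] - src[1]
    if dx <= 0.0:
        return 0.0  # receptor is upwind of the source
    sigma = 0.1 * dx + 1.0  # hypothetical linear plume spread
    return math.exp(-0.5 * (dy / sigma) ** 2) / (wind * sigma)

def make_sample():
    """One forward simulation from a random source under a random wind speed."""
    src = (random.uniform(0.0, 50.0), random.uniform(-20.0, 20.0))
    wind = random.uniform(2.0, 6.0)
    features = [plume(src, r, wind) for r in RECEPTORS] + [wind]
    return features, [src[0] / 50.0, src[1] / 20.0]  # scaled source coordinates

data = [make_sample() for _ in range(250)]
train, test = data[:200], data[200:]

# Standardize inputs with training-set statistics.
n_in = len(train[0][0])
mean = [sum(f[i] for f, _ in train) / len(train) for i in range(n_in)]
std = [math.sqrt(sum((f[i] - mean[i]) ** 2 for f, _ in train) / len(train)) or 1.0
       for i in range(n_in)]

def scale(features):
    return [(v - m) / s for v, m, s in zip(features, mean, std)]

# One hidden tanh layer, linear output layer.
n_hid, n_out = 8, 2
W1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [0.0] * n_hid
W2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [0.0] * n_out

def forward(x):
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y

def mse(samples):
    total = 0.0
    for f, t in samples:
        _, y = forward(scale(f))
        total += sum((a - b) ** 2 for a, b in zip(y, t))
    return total / len(samples)

def train_epoch(samples, lr=0.02):
    """One full-batch gradient-descent step on the mean squared error."""
    gW1 = [[0.0] * n_in for _ in range(n_hid)]
    gb1 = [0.0] * n_hid
    gW2 = [[0.0] * n_hid for _ in range(n_out)]
    gb2 = [0.0] * n_out
    for f, t in samples:
        x = scale(f)
        h, y = forward(x)
        dy = [2.0 * (a - b) / len(samples) for a, b in zip(y, t)]
        for k in range(n_out):
            gb2[k] += dy[k]
            for j in range(n_hid):
                gW2[k][j] += dy[k] * h[j]
        for j in range(n_hid):
            dh = sum(dy[k] * W2[k][j] for k in range(n_out)) * (1.0 - h[j] ** 2)
            gb1[j] += dh
            for i in range(n_in):
                gW1[j][i] += dh * x[i]
    for k in range(n_out):
        b2[k] -= lr * gb2[k]
        for j in range(n_hid):
            W2[k][j] -= lr * gW2[k][j]
    for j in range(n_hid):
        b1[j] -= lr * gb1[j]
        for i in range(n_in):
            W1[j][i] -= lr * gW1[j][i]

initial_loss = mse(train)
for _ in range(200):
    train_epoch(train)
final_loss = mse(train)
test_loss = mse(test)  # held-out error, for comparison against another estimator

def predict_source(features):
    """Map receptor readings plus wind speed to an (x, y) source estimate in metres."""
    _, y = forward(scale(features))
    return y[0] * 50.0, y[1] * 20.0
```

Once training is done, a call such as `predict_source(test[0][0])` is a single forward pass, which is consistent with the abstract's point that the expensive steps (simulation and training) can be done offline while the estimate itself is nearly instantaneous.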
environmental sciences