A comprehensive approach to analyzing environmental data with non-detects

Benjamin F. Trueman, Madison Gouthro, Amina K. Stoddart, Graham A. Gagnon
DOI: https://doi.org/10.26434/chemrxiv-2024-m2h4h
2024-09-05
Abstract: Non-detects (measurements reported as “below the detection limit”) are ubiquitous in environmental science and engineering. They are frequently replaced with a constant, but this biases estimates of means, regression slopes, and correlation coefficients. Omitting non-detects is worse, and has led to serious errors. Simple alternatives are available: rank-based statistics, maximum likelihood estimation, and repurposed survival analysis routines. But many environmental datasets do not align well with the assumptions these methods make: it is often necessary to account for hierarchy (e.g., measurements nested within lakes), sampling strategy (e.g., measurements collected as time series), heterogeneity (e.g., site-dependent variance), and measurement error. Bayesian methods offer the flexibility to do this; incorporating non-detects is also easy and does not bias model parameter estimates as substitution does. Here we discuss Bayesian implementations of common bivariate and multivariate statistical methods relevant to environmental science. We use a dataset comprising time series of Ag, As, Cd, Ce, Co, Sb, Ti, U, and V concentrations in municipal biosolids that includes many non-detects. The models can be reproduced and extended to new problems using the data and code accompanying this paper.
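The contrast between substitution and a likelihood that treats non-detects as censored can be illustrated with a small simulation. The sketch below is not the paper's code: it assumes a lognormal concentration model, a single detection limit, simulated data, and a crude compass search in place of a real optimizer or Bayesian sampler. All function and variable names are hypothetical. The idea is that a detected value contributes the normal density of its log, while a non-detect contributes the probability of falling below the log detection limit.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal, used for the CDF of non-detects


def censored_loglik(mu, log_sigma, logs, is_nd):
    """Normal log-likelihood for log-concentrations with left-censoring.

    logs[i] is the log measurement if detected, or the log detection
    limit if is_nd[i] is True.
    """
    sigma = math.exp(log_sigma)
    total = 0.0
    for x, nd in zip(logs, is_nd):
        z = (x - mu) / sigma
        if nd:
            # Non-detect: contributes P(X < detection limit)
            total += math.log(max(STD_NORMAL.cdf(z), 1e-300))
        else:
            # Detect: contributes the log of the normal density
            total += -0.5 * z * z - log_sigma - 0.5 * math.log(2 * math.pi)
    return total


def fit_censored_normal(logs, is_nd):
    """Maximize the censored log-likelihood with a simple compass search."""
    mu, log_sigma = mean(logs), math.log(stdev(logs))
    best = censored_loglik(mu, log_sigma, logs, is_nd)
    step = 0.5
    while step > 1e-6:
        improved = False
        for d_mu, d_ls in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = censored_loglik(mu + d_mu, log_sigma + d_ls, logs, is_nd)
            if cand > best:
                mu, log_sigma, best = mu + d_mu, log_sigma + d_ls, cand
                improved = True
        if not improved:
            step /= 2  # no better neighbour: refine the search grid
    return mu, math.exp(log_sigma)


# Simulated example (assumed values): log-concentrations ~ Normal(0, 1),
# detection limit of 1.0 on the concentration scale (about half censored).
random.seed(1)
log_dl = math.log(1.0)
true_logs = [random.gauss(0.0, 1.0) for _ in range(2000)]
is_nd = [x < log_dl for x in true_logs]
reported = [log_dl if nd else x for x, nd in zip(true_logs, is_nd)]

mu_hat, sigma_hat = fit_censored_normal(reported, is_nd)

# Substitution: replace every non-detect with half the detection limit.
substituted = [math.log(0.5) if nd else x for x, nd in zip(true_logs, is_nd)]
mu_sub, sigma_sub = mean(substituted), stdev(substituted)

print(f"censored MLE:  mu={mu_hat:.3f}, sigma={sigma_hat:.3f}")
print(f"substitution:  mu={mu_sub:.3f}, sigma={sigma_sub:.3f}")
```

In this simulated setup, substitution collapses all non-detects onto one value and so tends to understate the spread, whereas the censored likelihood recovers parameters close to the values used to generate the data. A Bayesian model of the kind the abstract describes would use the same likelihood contributions, combined with priors and whatever hierarchical or time-series structure the data require.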
Chemistry