Extreme-value modelling of migratory bird arrival dates: Insights from citizen science data

Jonathan Koh,Thomas Opitz
DOI: https://doi.org/10.48550/arXiv.2312.01870
2024-08-16
Abstract:Citizen science mobilises many observers and gathers huge datasets but often without strict sampling protocols, resulting in observation biases due to heterogeneous sampling effort, which can lead to biased statistical inferences. We develop a spatiotemporal Bayesian hierarchical model for bias-corrected estimation of arrival dates of the first migratory bird individuals at a breeding site. Higher sampling effort could be correlated with earlier observed dates. We implement data fusion of two citizen-science datasets with fundamentally different protocols (BBS, eBird) and map posterior distributions of the latent process, which contains four spatial components with Gaussian process priors: species niche; sampling effort; position and scale parameters of annual first arrival date. The data layer includes four response variables: counts of observed eBird locations (Poisson); presence-absence at observed eBird locations (Binomial); BBS occurrence counts (Poisson); first arrival dates (Generalised Extreme-Value). We devise a Markov Chain Monte Carlo scheme and check by simulation that the latent process components are identifiable. We apply our model to several migratory bird species in the northeastern US for 2001--2021 and find that the sampling effort significantly modulates the observed first arrival date. We exploit this relationship to effectively bias-correct predictions of the true first arrivals.
Methodology,Applications
What problem does this paper attempt to address?