A Bayesian Hidden Semi-Markov Model with Covariate-Dependent State Duration Parameters for High-Frequency Environmental Data

Shirley Rojas-Salazar, Erin M. Schliep, Christopher K. Wikle, Emily H. Stanley, Stephen R. Carpenter, Noah R. Lottig
DOI: https://doi.org/10.48550/arXiv.2109.09949
2021-09-21
Abstract: Environmental time series data observed at high frequencies can be studied with approaches such as hidden Markov and semi-Markov models (HMM and HSMM). HSMMs extend the HMM by explicitly modeling the time spent in each state. In a discrete-time HSMM, the duration in each state can be modeled with a zero-truncated Poisson distribution, where the duration parameter may be state-specific but constant in time. We extend the HSMM by allowing the state-specific duration parameters to vary in time and model them as a function of known covariates observed over a period of time leading up to a state transition. In addition, we propose a data subsampling approach given that high-frequency data can violate the conditional independence assumption of the HSMM. We apply the model to high-frequency data collected by an instrumented buoy in Lake Mendota. We model the phycocyanin concentration, which is used in aquatic systems to estimate the relative abundance of blue-green algae, and identify important time-varying effects associated with the duration in each state.
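To make the duration component concrete, the following is a minimal sketch of a zero-truncated Poisson duration whose rate depends on covariates. It is an illustration only, not the authors' implementation: the log link, the function names (`duration_rate`, `ztp_pmf`, `sample_duration`), and the covariate and coefficient values are all assumptions that do not come from the abstract, which says only that the duration parameters are modeled as a function of covariates.

```python
import math
import numpy as np


def duration_rate(covariates, beta):
    """Map covariates to a positive Poisson rate.

    ASSUMPTION: a log link, lam = exp(x . beta). The abstract does not
    state which link function the paper uses.
    """
    return float(np.exp(np.dot(covariates, beta)))


def ztp_pmf(d, lam):
    """P(D = d) for a zero-truncated Poisson with rate lam, d = 1, 2, ..."""
    if d < 1:
        return 0.0
    # Poisson log-pmf, renormalized by P(D > 0) = 1 - exp(-lam).
    log_p = (d * math.log(lam) - lam - math.lgamma(d + 1)
             - math.log1p(-math.exp(-lam)))
    return math.exp(log_p)


def sample_duration(lam, rng):
    """Draw one duration by sampling a Poisson and rejecting zeros."""
    while True:
        d = int(rng.poisson(lam))
        if d > 0:
            return d


# Hypothetical covariate summary (intercept plus one covariate) and
# hypothetical coefficients for a single state.
x = np.array([1.0, 0.5])
beta = np.array([1.0, 0.8])
lam = duration_rate(x, beta)

rng = np.random.default_rng(0)
durations = [sample_duration(lam, rng) for _ in range(5)]
```

Under this sketch, a change in the covariate summary `x` before a transition changes `lam`, and with it the distribution of time spent in the state, which is the time-varying behavior the abstract describes.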
Applications