Bayesian variable selection for spatially dependent generalized linear models

Kristian Lum
DOI: https://doi.org/10.48550/arXiv.1209.0661
2012-09-04
Abstract: Despite the abundance of methods for variable selection and accommodating spatial structure in regression models, there is little precedent for incorporating spatial dependence in covariate inclusion probabilities for regionally varying regression models. The lone existing approach is limited by difficult computation and the requirement that the spatial dependence be represented on a lattice, making this method inappropriate for areal models with irregular structures that often arise in ecology, epidemiology, and the social sciences. Here we present a novel method for spatial variable selection in areal generalized linear models that can accommodate arbitrary spatial structures and works with a broad subset of GLM likelihoods. The method uses a latent probit model with a spatial dependence structure where the binary response is taken as a covariate inclusion indicator for area-specific GLMs. The covariate inclusion indicators arise via thresholding of latent standard normals on which we place a conditionally autoregressive prior. We propose an efficient MCMC algorithm for computation that is entirely conjugate in any model with a conditionally Gaussian representation of the likelihood, thereby encompassing logistic, probit, multinomial probit and logit, Gaussian, and negative binomial regressions through the use of existing data augmentation methods. We demonstrate superior parameter recovery and prediction in simulation studies as well as in applications to geographic voting patterns and population estimation. Though the method is very broadly applicable, we note in particular that prior to this work, spatial population estimation/capture-recapture models allowing for varying list dependence structures have not been possible.
Methodology