Spatial Variable Selection and An Application to Virginia Lyme Disease Emergence

Yimeng Xie, Li Xu, Jie Li, Xinwei Deng, Yili Hong, Korine Kolivras, David N. Gaines
DOI: https://doi.org/10.1080/01621459.2018.1564670
IF: 4.369
2019-04-11
Journal of the American Statistical Association
Abstract: Lyme disease is an infectious disease that is caused by a bacterium called Borrelia burgdorferi sensu stricto. In the United States, Lyme disease is one of the most common infectious diseases. The major endemic areas of the disease are New England, Mid-Atlantic, East North Central, South Atlantic, and West North Central. Virginia is on the front line of the disease’s diffusion from the northeast to the south. One of the research objectives for the infectious disease community is to identify environmental and economic variables that are associated with the emergence of Lyme disease. In this article, we use a spatial Poisson regression model to link the spatial disease counts to environmental and economic variables, and we develop a spatial variable selection procedure that identifies important factors by using an adaptive elastic net penalty. The proposed methods can automatically select important covariates while adjusting for possible spatial correlations of disease counts. The performance of the proposed method is studied and compared with existing methods via a comprehensive simulation study. We apply the developed variable selection methods to the Virginia Lyme disease data and identify important variables that are new to the literature. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
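The core idea in the abstract, Poisson regression with an adaptive elastic net penalty, can be illustrated with a minimal sketch. This is not the authors' procedure: it omits the spatial correlation structure entirely, uses a plain proximal gradient solver, and runs on simulated data. The function names, the tuning values (`lam1`, `lam2`, `gamma`), and the simulated coefficients are all assumptions chosen for illustration.

```python
import numpy as np


def poisson_nll(X, y, beta):
    """Average Poisson negative log-likelihood (log link, no intercept)."""
    eta = X @ beta
    return np.mean(np.exp(eta) - y * eta)


def fit_penalized_poisson(X, y, lam1, lam2, weights, n_iter=2000, tol=1e-8):
    """Minimize nll + lam2 * ||beta||^2 + lam1 * sum_j weights[j] * |beta_j|
    by proximal gradient descent with backtracking on the step size."""
    n, p = X.shape
    beta = np.zeros(p)
    step = 1.0
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(n_iter):
            eta = X @ beta
            # Gradient of the smooth part (likelihood plus ridge term).
            grad = X.T @ (np.exp(eta) - y) / n + 2.0 * lam2 * beta
            f_old = poisson_nll(X, y, beta) + lam2 * (beta @ beta)
            while True:
                z = beta - step * grad
                # Soft-thresholding is the proximal map of the weighted L1 term.
                new = np.sign(z) * np.maximum(np.abs(z) - step * lam1 * weights, 0.0)
                diff = new - beta
                f_new = poisson_nll(X, y, new) + lam2 * (new @ new)
                if f_new <= f_old + grad @ diff + (diff @ diff) / (2.0 * step):
                    break
                step *= 0.5  # shrink the step until sufficient decrease holds
            beta = new
            if np.max(np.abs(diff)) < tol:
                break
    return beta


def adaptive_elastic_net_poisson(X, y, lam1=0.05, lam2=0.01, gamma=1.0):
    """Two-stage fit: a ridge-only initial estimate, then a weighted L1 + ridge fit."""
    p = X.shape[1]
    beta_init = fit_penalized_poisson(X, y, 0.0, lam2, np.ones(p))
    # Adaptive weights: small initial coefficients receive a heavier L1 penalty.
    weights = 1.0 / (np.abs(beta_init) + 1e-8) ** gamma
    return fit_penalized_poisson(X, y, lam1, lam2, weights)


def run_demo(seed=0):
    """Simulated (hypothetical) data: 8 covariates, only the first two matter."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((500, 8))
    beta_true = np.array([0.8, -0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
    y = rng.poisson(np.exp(X @ beta_true))
    return adaptive_elastic_net_poisson(X, y)


if __name__ == "__main__":
    print(np.round(run_demo(), 3))
```

In this sketch, covariates whose fitted coefficient is exactly zero are treated as not selected. Accounting for spatial dependence among the counts, which the abstract names as a key feature of the proposed method, would require adding a spatial component to the likelihood and is outside the scope of this illustration.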
statistics & probability