Estimating Spatial Variation in Disease Risk from Locations Coarsened by Incomplete Geocoding

Dale L. Zimmerman, Xiangming Fang
DOI: https://doi.org/10.1016/j.stamet.2011.01.008
2011-01-01
Statistical Methodology
Abstract: Inference for spatial variation in relative risk of disease is an important problem in spatial epidemiologic studies. A standard component of data assimilation in these studies is the assignment of a geocode, i.e. point-level spatial coordinates, to the address of each subject in the study population. Unfortunately, when geocoding is performed by the standard procedure of street-segment matching to a georeferenced road file and subsequent interpolation, it is rarely completely successful. Typically, 10-30% of the addresses in the study population fail to geocode, which can adversely affect relative risk estimation, especially if one of the disease groups (e.g. cases) has a different geocoding success rate than another (e.g. controls). The possibility exists, however, for ameliorating this effect by incorporating geographic information coarser than a point (e.g. a Zip code) that is measured for the observations that fail to geocode. This article develops coarsened-data methods for relative risk estimation from incompletely geocoded data. Nonparametric (kernel smoothing) estimation procedures are featured; parametric (likelihood-based) procedures are described as well, but their applicability is much more limited. We demonstrate, via simulation and a real example of childhood asthma cases in an Iowa county, that substantial improvements in the quality of relative risk estimates are possible using the proposed nonparametric coarsened-data methods. © 2011 Elsevier B.V. All rights reserved.
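For orientation, the kind of nonparametric relative risk estimate the abstract refers to can be sketched as the log ratio of two kernel density estimates, one for cases and one for controls. The sketch below is only a generic complete-data density-ratio estimator under stated assumptions (a Gaussian kernel, a common fixed bandwidth `h`, simulated locations on the unit square); the function names are hypothetical, and it does not reproduce the paper's coarsened-data procedures, which the abstract does not specify.

```python
import numpy as np

def gaussian_kde_2d(points, grid, h):
    """Fixed-bandwidth 2-D Gaussian kernel density estimate at each grid location."""
    diff = grid[:, None, :] - points[None, :, :]      # shape (m, n, 2)
    sq_dist = np.sum(diff ** 2, axis=2)               # squared distances, shape (m, n)
    kern = np.exp(-0.5 * sq_dist / h ** 2) / (2.0 * np.pi * h ** 2)
    return kern.mean(axis=1)                          # average kernel over data points

def log_relative_risk(cases, controls, grid, h):
    """Log ratio of case density to control density (assumed common bandwidth h)."""
    f_cases = gaussian_kde_2d(cases, grid, h)
    f_controls = gaussian_kde_2d(controls, grid, h)
    return np.log(f_cases) - np.log(f_controls)

# Simulated, fully geocoded data (illustrative only, not from the paper):
# controls are uniform; a quarter of the cases cluster near (0.7, 0.7).
rng = np.random.default_rng(0)
controls = rng.uniform(0.0, 1.0, size=(500, 2))
cases = np.vstack([
    rng.uniform(0.0, 1.0, size=(150, 2)),
    rng.normal(loc=[0.7, 0.7], scale=0.05, size=(50, 2)),
])

# Evaluate inside the cluster and at a background location.
grid = np.array([[0.7, 0.7], [0.2, 0.2]])
rho = log_relative_risk(cases, controls, grid, h=0.1)
```

Here `rho[0]` is the estimated log relative risk inside the simulated cluster and `rho[1]` the estimate at a background location. Dropping the records that fail to geocode before calling `log_relative_risk` would correspond to the naive analysis whose bias the abstract describes when success rates differ between cases and controls.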