On application of a response propensity model to estimation from web samples

Vladislav Beresovsky
DOI: https://doi.org/10.48550/arXiv.1906.08444
2019-06-20
Methodology
Abstract: Increasing nonresponse rates and the cost of data collection are two pressing problems encountered in traditional probability surveys. The proliferation of inexpensive data from web surveys stimulates interest in statistical techniques for valid inferences from web samples. We consider estimation of population and domain means in the two-sample setup, where the web sample contains variables of interest and covariates that are shared with an auxiliary probability survey sample. First, we propose an estimator of the population mean based on the estimated propensity of response to a web survey. This makes inference from web samples similar to the well-established techniques used for observational studies and missing data problems. Second, we propose an 'implicit' logistic regression for estimating the parameters of the web response model in the two-sample setup. Implicit logistic regression uses selection probabilities, nominally defined for web sample units, and the size of the hypothetical population of responders to a web survey. A simulation study confirms the validity of implicit logistic regression and its higher efficiency compared with alternative estimators of web response propensity.
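To make the idea of a propensity-based estimator of the population mean concrete, here is a minimal sketch of a generic inverse-propensity-weighted (Hajek-type) mean. It is not the paper's implicit logistic regression: it assumes the response propensities have already been estimated by some method, and the function names `logistic_propensity` and `hajek_mean` are hypothetical, chosen only for illustration.

```python
import math


def logistic_propensity(x, intercept, slope):
    """Response propensity under an assumed one-covariate logistic model.

    The coefficients are taken as given here; how they are estimated
    (for example, by the paper's implicit logistic regression) is out
    of scope for this sketch.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def hajek_mean(y, propensity):
    """Inverse-propensity-weighted (Hajek-type) estimate of a population mean.

    Each web respondent i is weighted by 1 / propensity[i], so units that
    are unlikely to respond stand in for more of the population.
    """
    if len(y) != len(propensity):
        raise ValueError("y and propensity must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in propensity):
        raise ValueError("propensities must lie in (0, 1]")
    weights = [1.0 / p for p in propensity]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Toy web sample: the third respondent comes from a group that responds
# half as often, so it receives twice the weight of the other two.
y = [10.0, 10.0, 20.0]
propensity = [0.5, 0.5, 0.25]

naive = sum(y) / len(y)               # unweighted web-sample mean
weighted = hajek_mean(y, propensity)  # propensity-adjusted mean
```

In this toy example the unweighted mean underrepresents the low-propensity group, and the weighted mean shifts toward that group's value. With real data the quality of the adjustment depends on how well the propensities are estimated.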