Using Double Machine Learning to Understand Nonresponse in the Recruitment of a Mixed-Mode Online Panel
Barbara Felderer, Jannis Kueck, Martin Spindler
DOI: https://doi.org/10.1177/08944393221095194
2022-06-10
Social Science Computer Review
Abstract: Survey scientists increasingly face the problem of high dimensionality in their research as digitization makes it much easier to construct high-dimensional (or "big") data sets through tools such as online surveys and mobile applications. Machine learning methods are able to handle such data, and they have been successfully applied to solve predictive problems. However, in many situations, survey statisticians want to learn about causal relationships to draw conclusions and be able to transfer the findings of one survey to another. Standard machine learning methods provide biased estimates of such relationships. We introduce into survey statistics the double machine learning approach, which gives approximately unbiased estimators of parameters of interest, and show how it can be used to analyze survey nonresponse in a high-dimensional panel setting. The double machine learning approach here assumes unconfoundedness of variables as its identification strategy. In high-dimensional settings, where the number of potential confounders to include in the model is too large, the double machine learning approach secures valid inference by selecting the relevant confounding variables.
Keywords: social sciences, interdisciplinary; computer science, interdisciplinary applications; information science & library science
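
To make the idea in the abstract concrete, the following is a minimal sketch of double machine learning by cross-fitted partialling out. It is not the authors' implementation. The partially linear model, the lasso nuisance learner, the penalty value, the simulated data, and all function names (fit_lasso, dml_plr, simulate) are assumptions chosen for illustration. The lasso stands in for the step the abstract describes as selecting the relevant confounding variables.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso(X, y, lam, n_iter=50):
    """Lasso by coordinate descent on centered data; returns a predict function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new

    def predict(row):
        return y_mean + sum(b * (row[j] - x_mean[j]) for j, b in enumerate(beta))

    return predict


def dml_plr(y, d, X, n_folds=2, lam=0.05, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    y_res, d_res = [0.0] * n, [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        X_tr = [X[i] for i in train]
        # Nuisance functions are fitted only on the other folds (cross-fitting).
        pred_y = fit_lasso(X_tr, [y[i] for i in train], lam)
        pred_d = fit_lasso(X_tr, [d[i] for i in train], lam)
        for i in held_out:
            y_res[i] = y[i] - pred_y(X[i])
            d_res[i] = d[i] - pred_d(X[i])
    # Regress outcome residuals on treatment residuals.
    denom = sum(v * v for v in d_res)
    theta = sum(v * u for v, u in zip(d_res, y_res)) / denom
    # Sandwich standard error based on the orthogonal score.
    score_sq = sum((v * (u - theta * v)) ** 2 for v, u in zip(d_res, y_res))
    se = score_sq ** 0.5 / denom
    return theta, se


def simulate(n=300, p=10, theta=1.0, seed=1):
    """Hypothetical data: X[0] and X[1] confound the effect of d on y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    d = [0.8 * r[0] + 0.5 * r[1] + rng.gauss(0, 1) for r in X]
    y = [theta * di + r[0] - 0.5 * r[2] + rng.gauss(0, 1) for di, r in zip(d, X)]
    return y, d, X


y, d, X = simulate()
theta_hat, se = dml_plr(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se:.3f} (true theta = 1.0)")
```

The estimate comes from regressing outcome residuals on treatment residuals, so moderate regularization error in either nuisance fit has only a second-order effect on theta_hat. In applied work one would more likely use an established library and tune the penalty by cross-validation; the hand-written lasso here only keeps the sketch free of dependencies.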