Sampling eCommerce Data from the Web: Methodological and Practical Issues

Galit Shmueli, Wolfgang Jank, Ravi Bapna
2006-01-01
Abstract: Empirical research that is based on web-collected data has been rapidly growing thanks to the large amounts of freely available web-data and the technological wonders of web spiders for grabbing data. This is especially true for electronic commerce research, which yields results that can be very influential on the market. Although all studies rely on inferences from the collected data to some population of interest, there has been nearly no attention paid to sampling issues. The methodology of statistical sampling is very relevant in web-data collection. It includes defining observational units and target and sampled populations, determining sources of sampling and non-sampling errors, choosing appropriate sampling designs, and adjusting sample estimators to reduce bias and increase precision. Sampling eCommerce data shares many characteristics with other types of sampling (e.g. surveys), but also has special features that researchers should be aware of and account for. In this paper we discuss web-data, and in particular eCommerce data collection in the context of sampling methodology, and suggest improvements to current practice in this modern sampling setting.