On Two Existing Approaches to Statistical Analysis of Social Media Data

Martina Patone, Li‐Chun Zhang
DOI: https://doi.org/10.1111/insr.12404
2020-08-26
International Statistical Review
Abstract: Using social media data for statistical analysis of the general population commonly faces two basic obstacles: firstly, social media data are collected for different objects than the population units of interest; secondly, the relevant measures are typically not available directly but need to be extracted by algorithms or machine learning techniques. In this paper, we examine and summarise two existing approaches to statistical analysis based on social media data that can be discerned in the literature. In the first approach, analysis is applied to the social media data that are organised around the objects directly observed in the data; in the second, a different analysis is applied to a constructed pseudo survey dataset, which aims to transform the observed social media data into a set of units from the target population. We systematically elaborate the relevant data quality frameworks, exemplify their applications and highlight some typical challenges associated with social media data.
statistics & probability