A User Modeling Pipeline for Studying Polarized Political Events in Social Media

Roberto Napoli, Ali Mert Ertugrul, Alessandro Bozzon, Marco Brambilla
DOI: https://doi.org/10.48550/arXiv.1807.09459
2018-07-25
Abstract: This paper presents a user modeling pipeline to analyze discussions and opinions shared on social media regarding polarized political events (e.g., public polls). The pipeline follows a four-step methodology. First, social media posts and users metadata are crawled. Second, a filtering mechanism is applied to filter spammers and bot users. As a third step, demographics information is extracted out of the valid users, namely gender, age, ethnicity and location information. Finally, the political polarity of the users with respect to the analyzed event is predicted. In the scope of this work, our proposed pipeline is applied to two referendum scenarios (independence of Catalonia in Spain and autonomy of Lombardy in Italy) in order to assess the performance of the approach with respect to the capability of collecting correct insights on the demographics of social media users and of predicting the poll results based on the opinions shared by the users. Experiments show that the method was effective in predicting the political trends for the Catalonia case, but not for the Lombardy case. Among the various motivations for this, we noticed that in general Twitter was more representative of the users opposing the referendum than the ones in favor.
Social and Information Networks, Computers and Society
What problem does this paper attempt to address?
The paper addresses how to systematically study discussions and opinions shared on social media about polarized political events such as referendums. It proposes a user modeling pipeline that analyzes social media data in four steps:

1. **Data Collection**: Crawling posts and user metadata from social media.
2. **Filtering Mechanism**: Removing spammers and bot accounts.
3. **Demographic Analysis**: Extracting demographic information about the remaining valid users, namely gender, age, ethnicity, and location.
4. **Political Polarity Prediction**: Predicting each user's political polarity with respect to the analyzed event.

The pipeline is applied to two real-world referendum cases: the Catalonia independence referendum in Spain and the Lombardy autonomy referendum in Italy. The evaluation considers two things: how well the pipeline collects correct demographic insights about social media users, and how well it predicts the poll results from the opinions those users share. The experiments show that the method was effective in predicting political trends for the Catalonia case but not for the Lombardy case. One of the reasons the authors give is that, in general, Twitter was more representative of users opposing the referendum than of users in favor.
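
The four steps above can be sketched as a small staged pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the `User` record, `run_pipeline`, `polarity_shares`, and the three pluggable callables are hypothetical names, the crawling step is assumed to have already produced the input users, and the abstract does not specify how bots are detected or how demographics and polarity are inferred.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class User:
    """Hypothetical record for one crawled account (step 1 output)."""
    user_id: str
    posts: List[str]
    gender: Optional[str] = None    # filled in by the demographics step
    polarity: Optional[str] = None  # "for", "against", or None if undetermined


def run_pipeline(
    users: Iterable[User],
    is_bot: Callable[[User], bool],
    infer_gender: Callable[[User], Optional[str]],
    infer_polarity: Callable[[User], Optional[str]],
) -> List[User]:
    """Apply steps 2-4 to already-crawled users and return the valid ones."""
    valid = []
    for user in users:
        # Step 2: drop spammers and bots (the detector is a placeholder).
        if is_bot(user):
            continue
        # Step 3: demographics (only gender here, to keep the sketch short).
        user.gender = infer_gender(user)
        # Step 4: polarity with respect to the analyzed event.
        user.polarity = infer_polarity(user)
        valid.append(user)
    return valid


def polarity_shares(users: Iterable[User]) -> Dict[str, float]:
    """Share of decided users on each side; undetermined users are ignored."""
    decided = [u for u in users if u.polarity in ("for", "against")]
    if not decided:
        return {"for": 0.0, "against": 0.0}
    n_for = sum(1 for u in decided if u.polarity == "for")
    n_against = len(decided) - n_for
    return {"for": n_for / len(decided), "against": n_against / len(decided)}
```

Passing the three stages in as callables keeps the sketch independent of any particular classifier. Note that shares computed this way describe only the users who were collected, which is relevant to the reported finding that Twitter over-represented one side of the referendum.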