Social media for polling and predicting United States election outcome

Brian Heredia, Joseph D. Prusa, Taghi M. Khoshgoftaar
DOI: https://doi.org/10.1007/s13278-018-0525-y
2018-07-17
Social Network Analysis and Mining
Abstract: Twitter has been at the forefront of political discourse, with politicians choosing it as their platform for disseminating information to their constituents. We seek to explore the effectiveness of social media as a resource for both polling and predicting the election outcome. To this end, we create a dataset consisting of approximately 3 million tweets ranging from September 22nd to November 8th, 2016. Polling analysis is performed at two levels: national and state. Predicting the election is performed only at the state level due to the electoral college process present in the U.S. election system. Two approaches are used for predicting the election: a winner-take-all approach and a shared elector count approach. Twenty-one states are chosen, eleven categorized as swing states and ten as heavily favored states. Two metrics are incorporated for polling and predicting the election outcome: tweet volume per candidate and positive sentiment per candidate. Our approach shows that, when polling at the national level, sentiment aggregated across the election time period provides values close to the polls. At the state level, volume is not a good candidate for polling state votes. Sentiment produces values closer to swing state polls when the election is close.
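The two prediction approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the placeholder states and candidates ("State X", "State Y", "A", "B"), and all counts are hypothetical, and the proportional split used for the shared elector count is only one plausible reading of that approach. The per-candidate counts may stand for either metric, tweet volume or positive-sentiment tweets.

```python
from collections import Counter


def candidate_shares(counts):
    """Turn per-candidate counts (tweet volume or positive tweets) into shares."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tweets counted")
    return {name: n / total for name, n in counts.items()}


def winner_take_all(state_shares, electors):
    """Award all of a state's electors to the candidate with the largest share."""
    totals = Counter()
    for state, shares in state_shares.items():
        winner = max(shares, key=shares.get)
        totals[winner] += electors[state]
    return dict(totals)


def shared_electors(state_shares, electors):
    """Split each state's electors in proportion to share (assumed interpretation)."""
    totals = Counter()
    for state, shares in state_shares.items():
        for name, share in shares.items():
            totals[name] += share * electors[state]
    return dict(totals)


# Hypothetical inputs: illustrative numbers only, not data from the paper.
positive_counts = {
    "State X": {"A": 600, "B": 400},
    "State Y": {"A": 450, "B": 550},
}
electors = {"State X": 10, "State Y": 20}

state_shares = {s: candidate_shares(c) for s, c in positive_counts.items()}
wta = winner_take_all(state_shares, electors)
shared = shared_electors(state_shares, electors)
print(wta)
print(shared)
```

In this toy input, candidate A wins the smaller state and B the larger one, so winner-take-all favors B, while the proportional split gives both candidates roughly the same elector total. That gap illustrates why the choice between the two approaches matters in close states.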