Wisdom of Fusion: Prediction of 2016 Taiwan Election with Heterogeneous Big Data
Zheng Xie, Guannan Liu, Junjie Wu, Lihong Wang, Chunyang Liu
DOI: https://doi.org/10.1109/icsssm.2016.7538625
2016-01-01
Abstract: The use of social media for political discourse has received much attention because of its real-time and interactive nature, especially around election time. Recent studies have explored the power of a single online platform, such as Google or Twitter, to record current social trends and predict voting outcomes in a particular region. These pilot studies, though interesting, neither integrate the wider range of heterogeneous information available online nor account for the demographic bias of online users, most of whom are young. In this work, by aggregating online data from social media and offline data from pollsters, we achieve accurate prediction of the candidates' vote shares in the 2016 Taiwan presidential election, with error rates ranging from 0.30% to 2.85%. Our main contributions are as follows. First, to the best of our knowledge, this is among the earliest studies to fuse heterogeneous information for election prediction. Three types of online information are collected as signals of online public opinion, drawn from social networking sites (e.g. Facebook and Twitter), search engines (e.g. Google), and campaign homepages. To reduce voter bias, we further introduce offline demographic information to weight online and offline voting for a final prediction. Second, by treating election prediction as an unsupervised sequential prediction task, we introduce the Kalman filter, a widely used signal processing method, to automatically select reliable information sources and fuse them for daily prediction. Finally, because tweet volumes on Twitter are highly sensitive to events, a Moving Average model is applied for real-time burst detection. Our work provides unique value in identifying important online information sources and their valid periods for election prediction, and shows great potential for event influence analytics.
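The two signal-processing ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random-walk state model, the fixed per-source noise variances, and the burst threshold are all assumptions made for illustration. In particular, the abstract states that reliable sources are selected automatically, whereas this sketch simply takes each source's noise variance as given.

```python
from statistics import mean, pstdev


def kalman_fuse(observations, noise_vars, process_var=1e-4, x0=0.5, p0=1.0):
    """Scalar Kalman filter over daily readings from several sources.

    observations: one list per day, holding one reading per source
                  (e.g. a candidate's estimated vote share).
    noise_vars:   assumed measurement-noise variance per source
                  (hypothetical values, not taken from the paper).
    Returns the fused estimate at the end of each day.
    """
    x, p = x0, p0
    estimates = []
    for day in observations:
        p += process_var              # predict: random-walk state model
        for z, r in zip(day, noise_vars):
            k = p / (p + r)           # Kalman gain: small for noisy sources
            x += k * (z - x)          # move the estimate toward the reading
            p *= 1.0 - k              # reduce the estimate's variance
        estimates.append(x)
    return estimates


def detect_bursts(volumes, window=3, threshold=2.0):
    """Flag index t when volumes[t] exceeds the moving average of the
    previous `window` values by more than `threshold` standard deviations."""
    bursts = []
    for t in range(window, len(volumes)):
        past = volumes[t - window:t]
        if volumes[t] > mean(past) + threshold * pstdev(past):
            bursts.append(t)
    return bursts


if __name__ == "__main__":
    # Made-up readings: source 0 is assumed reliable, source 1 noisy.
    daily = [[0.56, 0.50]] * 10
    print(kalman_fuse(daily, noise_vars=[0.0004, 0.01])[-1])

    # Made-up daily tweet counts with one spike.
    print(detect_bursts([100, 102, 98, 101, 99, 400, 103, 100]))
```

In this sketch a source with a smaller assumed noise variance receives a larger Kalman gain, so the fused estimate stays close to the more reliable source. The burst detector compares each day's volume only with a short trailing window, which is one simple way to realise a moving-average baseline.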