Finding the True Crowds: User Filtering in Microblogs

Bin Hao, Min Zhang, Weizhi Ma, Jiashen Sun, Yiqun Liu, Shaoping Ma, Xuan Zhu, Hengliang Luo
DOI: https://doi.org/10.1007/978-3-319-50496-4_50
2016-01-01
Abstract: Users now routinely share their opinions about products, services, and policies on social media, and this feedback from the crowd is important to manufacturers and governments. In microblogs, however, information is highly unbalanced: many posts are published and spread by ghost-writers/spammers, sellers, official accounts, and similar accounts, so information provided by the true crowds is frequently overwhelmed. Previous studies mostly focus on finding one specific type of user and do not investigate how to filter out multiple types of users so that only the true crowds remain, which is the main topic of this work. In this paper, we first present a categorization of four types of users, namely ghost-writers, sellers, official accounts, and end-users (the former three are referred to as advertisers in a broad sense), and study their characteristics. We then propose a Topic-Specific Divergence based model that filters out advertisers so that end-users are kept. Meta-information and content are investigated in a comparative analysis. Experimental results on a real dataset show that the proposed approach significantly outperforms state-of-the-art methods.