Topic Detection in Twitter Based on Label Propagation Model

Dongxu Huang, Dejun Mu
DOI: https://doi.org/10.1109/DCABES.2014.23
2014-01-01
Abstract: A huge number of tweets about real-world events are generated every day on Twitter. However, these messages are unorganized, and classifying them by topic and event is one of the main challenges in extracting knowledge effectively. To solve this problem, we propose a novel method that combines a clustering algorithm with a label propagation algorithm to detect topics on Twitter. First, we use the canopy clustering algorithm to cluster tweets; because canopy clustering can assign a tweet to several clusters, only tweets that belong to exactly one cluster are labeled at this stage. Second, a label propagation mechanism is used to label the tweets that lie in the overlap of different clusters. To evaluate our algorithm, we compare it against two baselines, LDA (Latent Dirichlet Allocation) and the Single-Pass clustering algorithm. We apply the three algorithms to a tweet dataset containing three topics and some noisy data, and the experimental results show that our method outperforms the other algorithms in precision and recall.
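The two-stage idea in the abstract can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything specific here is an assumption made for illustration: the whitespace bag-of-words representation, cosine similarity, the threshold values `t_loose` and `t_tight`, the function names, and the propagation rule itself, which is simplified to a single similarity-weighted vote from already-labeled tweets rather than the authors' actual label propagation model.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def canopy(vectors, t_loose, t_tight):
    """Canopy clustering on similarities (assumes t_loose < t_tight).

    Returns a list of canopies, each a set of item indices.
    An item may appear in more than one canopy.
    """
    remaining = list(range(len(vectors)))
    canopies = []
    while remaining:
        center = remaining[0]
        members = {center}
        keep = []
        for i in remaining[1:]:
            sim = cosine(vectors[center], vectors[i])
            if sim >= t_loose:      # loosely similar: joins this canopy
                members.add(i)
            if sim < t_tight:       # not tightly bound: stays a candidate
                keep.append(i)
        canopies.append(members)
        remaining = keep
    return canopies


def detect_topics(tweets, t_loose=0.3, t_tight=0.7):
    """Label each tweet with a canopy index (hypothetical helper)."""
    vectors = [Counter(t.lower().split()) for t in tweets]
    canopies = canopy(vectors, t_loose, t_tight)
    membership = {
        i: [c for c, members in enumerate(canopies) if i in members]
        for i in range(len(tweets))
    }

    # Stage 1: tweets in exactly one canopy take that canopy as their label.
    seeds = {i: m[0] for i, m in membership.items() if len(m) == 1}

    # Stage 2 (simplified): each overlapping tweet takes the candidate label
    # with the largest summed similarity to the seed tweets carrying it.
    labels = dict(seeds)
    for i, candidates in membership.items():
        if i in seeds:
            continue
        scores = Counter()
        for j, label in seeds.items():
            if label in candidates:
                scores[label] += cosine(vectors[i], vectors[j])
        labels[i] = scores.most_common(1)[0][0] if scores else None
    return labels


if __name__ == "__main__":
    sample = [
        "apple iphone launch event",
        "apple iphone launch today",
        "football match goal score",
        "football match goal tonight",
        "apple iphone launch match goal",  # overlaps both topics
    ]
    print(detect_topics(sample))
```

In this toy input the last tweet falls into both canopies and is resolved by the vote, while the other four are labeled directly by their single canopy. A tweet with no labeled neighbor in any of its canopies is left as `None` in this sketch.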