Mining Hot Phrases on Social Network Text Streams Based on AC-Trie

Jiu-ming HUANG, Quan-yuan WU, Sheng-dong ZHANG, Yan JIA, Dong LIU, Bin ZHOU
DOI: https://doi.org/10.3969/j.issn.0372-2112.2016.10.026
2016-01-01
Abstract: Hot phrases in social network text streams can reflect hidden hot topics and sudden events. This paper proposes a hot phrase mining technique that supports various hot degree measures and requires no word segmentation. We first construct an AC-Trie from candidate phrases gathered from the text streams. Using this AC-Trie, we record the historical occurrence frequency of each phrase on the Trie by scanning the subsequent streams in a single pass. Because hot phrases evolve over time, the AC-Trie must be reconstructed from new samples of the text stream; we therefore trigger the reconstruction dynamically by estimating the occurrence frequency of the missed phrases. Experiments on Sina micro-blog data show that our approach is effective (precision of 89%) and efficient (overhead is 2% of the naïve approach).
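The single-pass counting step can be illustrated with a generic Aho-Corasick automaton. The sketch below is not the authors' implementation: the class name `ACTrie`, its methods, and the sample phrases are all hypothetical, and it omits the hot degree measures and the dynamic reconstruction trigger, which the abstract does not specify. It assumes non-empty candidate phrases and matches character by character, so no word segmentation is needed.

```python
from collections import deque


class ACTrie:
    """Hypothetical minimal Aho-Corasick automaton that counts phrase occurrences."""

    def __init__(self, phrases):
        self.goto = [{}]   # goto[node][char] -> child node
        self.fail = [0]    # failure link of each node
        self.out = [[]]    # phrases recognised on reaching each node
        self.counts = {p: 0 for p in phrases}
        for phrase in self.counts:
            self._insert(phrase)
        self._build_failure_links()

    def _insert(self, phrase):
        node = 0
        for ch in phrase:
            nxt = self.goto[node].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto[node][ch] = nxt
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
            node = nxt
        self.out[node].append(phrase)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 nodes keep failure link 0 (the root).
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit phrases that end at the failure target (proper suffixes).
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def scan(self, text):
        """Update the counts of all candidate phrases in one pass over text."""
        node = 0
        for ch in text:
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            for phrase in self.out[node]:
                self.counts[phrase] += 1


# Usage: build once from candidate phrases, then scan each incoming message.
trie = ACTrie(["he", "she", "hers"])
for message in ["ushers", "he she"]:
    trie.scan(message)
print(trie.counts)
```

Each message is read once regardless of how many candidate phrases are stored, which is the property that makes this structure attractive for stream processing compared with searching for every phrase separately.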