A Novel Hot Topic Detection Framework With Integration of Image and Short Text Information From Twitter
Chengde Zhang, Shaozhen Lu, Chengming Zhang, Xia Xiao, Qian Wang, Gao Chen
DOI: https://doi.org/10.1109/access.2018.2886366
IF: 3.9
2019-01-01
IEEE Access
Abstract: Tweets are short and noisy, so they offer only a limited number of textual features, which makes hot topic detection on Twitter a challenging task. In this paper, a novel four-stage framework is proposed to improve the performance of topic detection. The first stage is data preprocessing. In the second stage, deep learning is exploited to enrich the short text information via image understanding. In the third stage, an improved latent Dirichlet allocation (LDA) model is used to optimize the image effective word pairs, which improves the accuracy of the extracted topic words. Finally, short text and images are integrated for topic detection, and the corresponding topics are mined by fuzzy matching of topic words. Extensive experiments show that the proposed framework significantly improves topic detection performance and outperforms the selected baseline methods.
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic
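
The abstract's final stage mines corresponding topics by fuzzy matching of topic words, but it does not say how the matching is computed. The sketch below is one possible reading, not the authors' method: it pairs each text-derived topic with the image-derived topic whose words overlap most under approximate string similarity. The similarity measure (Python's `difflib.SequenceMatcher`), the function names, and both thresholds (`threshold`, `min_overlap`) are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two words, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_overlap(text_words, image_words, threshold=0.8):
    """Fraction of text topic words that have a close match among image words.

    `threshold` is an assumed cutoff, not a value taken from the paper.
    """
    if not text_words:
        return 0.0
    matched = sum(
        1
        for w in text_words
        if any(word_similarity(w, v) >= threshold for v in image_words)
    )
    return matched / len(text_words)


def match_topics(text_topics, image_topics, threshold=0.8, min_overlap=0.5):
    """Pair each text topic with the image topic of highest fuzzy overlap.

    Both arguments map a topic id to its list of topic words. A pair is kept
    only if its overlap reaches `min_overlap` (also an assumed value).
    """
    pairs = {}
    for t_id, t_words in text_topics.items():
        best_id, best_score = None, 0.0
        for i_id, i_words in image_topics.items():
            score = fuzzy_overlap(t_words, i_words, threshold)
            if score > best_score:
                best_id, best_score = i_id, score
        if best_id is not None and best_score >= min_overlap:
            pairs[t_id] = (best_id, best_score)
    return pairs


# Toy topic words, invented for this example only.
text_topics = {"t1": ["football", "goal", "stadium"]}
image_topics = {
    "i1": ["footballs", "stadium", "crowd"],
    "i2": ["cat", "sofa"],
}
pairs = match_topics(text_topics, image_topics)
```

In this toy input, "football" and "stadium" each find a close match in topic `i1` while "goal" does not, so `t1` is paired with `i1` at an overlap of two thirds. A real pipeline would take the word lists from the topic model outputs and tune both thresholds on held-out data.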