CS-BTM: a semantics-based hot topic detection method for social network
Weinan Niu,Wenan Tan,Wei Jia
DOI: https://doi.org/10.1007/s10489-022-03500-9
IF: 5.3
2022-04-10
Applied Intelligence
Abstract: Social network platforms play a significant role in daily life and have produced an enormous number of documents. Detecting the topics of these documents helps users quickly find the documents they are interested in. Consequently, topic detection for microblogs is attracting increasing attention. However, it is a difficult task because of two major challenges. First, in topic extraction, microblog texts are short and sparse, so most existing algorithms, which are designed for long texts, do not handle them well. Second, most traditional text semantics processing models do not consider context semantics and polysemy, which may lead to inaccurate results. To address these two challenges, a new model called CS-BTM (Context Semantics-based Biterm Topic Model) is proposed to improve BTM (Biterm Topic Model). When counting biterm occurrences, CS-BTM uses the BERT model to mine biterms that are similar in context semantics, which helps detect the topic words of each topic. Moreover, by optimizing the Single-pass clustering algorithm, we propose a new algorithm that clusters the topics obtained by CS-BTM for topic detection. Experiments show that the proposed method is effective for topic detection in comparison with several state-of-the-art methods.
computer science, artificial intelligence
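The abstract builds on two standard components: biterms (unordered word pairs drawn from one short text) and Single-pass clustering. The sketch below illustrates only these textbook baselines, not the authors' CS-BTM model or their optimized clustering algorithm. The function names, the cosine similarity measure, the running-mean centroid update, and the 0.8 threshold are all assumptions chosen for illustration; the BERT-based similarity step described in the abstract is not reproduced here.

```python
from itertools import combinations
import math


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text.

    Each pair is sorted so that (a, b) and (b, a) count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.8):
    """Classic Single-pass clustering (baseline, not the paper's variant).

    Each vector is visited once. It joins the most similar existing cluster
    if that similarity reaches `threshold` (an assumed value); otherwise it
    starts a new cluster. Returns a list of clusters as lists of indices.
    """
    centroids = []  # one running-mean centroid per cluster
    members = []    # indices of the vectors assigned to each cluster
    for i, vec in enumerate(vectors):
        best, best_sim = -1, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best < 0:
            centroids.append(list(vec))
            members.append([i])
        else:
            members[best].append(i)
            n = len(members[best])
            # Update the centroid as the running mean of its members.
            centroids[best] = [
                (m * (n - 1) + x) / n for m, x in zip(centroids[best], vec)
            ]
    return members
```

In such a pipeline, the input vectors would be topic representations (for example, topic-word distributions), so that similar topics are merged in a single scan of the data. The result depends on the order of the input and on the threshold, which is a known limitation of the baseline algorithm.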