News Topic Discovery Through Community Detection

Daqing Wu, Xiangyang Guo, Jinwen Ma
DOI: https://doi.org/10.1109/icsidp47821.2019.9173189
2019-01-01
Abstract: With the rapid development of communication and the internet, a huge number of news items are published every day. Because of the way news spreads, many items concentrate on a single topic, such as the same event or person. News topic discovery has therefore become an important and urgent task in text mining. Latent Dirichlet Allocation (LDA), which treats each document as generated from a finite mixture of $K$ possible topics, is the most frequently used model for this task. However, its performance is often unsatisfactory in practical applications. In this paper, we address this problem through text structure mining. Our proposed method consists of two steps. The first step finds the topics as clusters, or communities, of the news items by means of community detection. The second step uses a Bayesian unigram model to obtain the topic tokens for each topic. Experimental results on a real-world news dataset show that the proposed method discovers topics much better than LDA.
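
The two-step pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the news graph is built, which community detection algorithm is used, or how the unigram prior is chosen, so everything here is an assumption for illustration only: news items are linked when the Jaccard overlap of their tokens reaches a hypothetical `threshold`, communities are found with a simple deterministic label propagation, and topic tokens are ranked by the posterior mean of a unigram model under a symmetric Dirichlet prior with a hypothetical parameter `alpha`. The toy `docs` list is invented for the example and is not the paper's data.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(docs, threshold=0.2):
    """Step 0 (assumed): link two news items whose token overlap reaches the threshold."""
    token_sets = [set(d) for d in docs]
    adj = {i: set() for i in range(len(docs))}
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def label_propagation(adj, max_iter=100):
    """Step 1 (assumed algorithm): asynchronous label propagation.

    Each node adopts the most frequent label among its neighbours;
    ties are broken by the smallest label so the result is deterministic.
    """
    labels = {node: node for node in adj}
    for _ in range(max_iter):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated item keeps its own label
            counts = Counter(labels[m] for m in adj[node])
            best = max(counts.values())
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def topic_tokens(docs, labels, top_k=3, alpha=0.01):
    """Step 2: rank tokens per community by a smoothed unigram probability.

    The probability is the posterior mean under a symmetric Dirichlet(alpha)
    prior: (count + alpha) / (total + alpha * vocabulary size).
    """
    vocab = {t for d in docs for t in d}
    topics = {}
    for label in set(labels.values()):
        counts = Counter(
            t for i, d in enumerate(docs) if labels[i] == label for t in d
        )
        total = sum(counts.values())
        probs = {
            t: (counts[t] + alpha) / (total + alpha * len(vocab)) for t in vocab
        }
        # Sort by descending probability, then alphabetically for stable output.
        ranked = sorted(probs, key=lambda t: (-probs[t], t))
        topics[label] = ranked[:top_k]
    return topics


# Toy, already-tokenized news items (invented for illustration).
docs = [
    ["election", "vote", "president", "campaign"],
    ["president", "election", "vote", "poll"],
    ["campaign", "vote", "election", "debate"],
    ["football", "match", "goal", "league"],
    ["league", "goal", "football", "coach"],
    ["match", "football", "goal", "penalty"],
]

labels = label_propagation(build_graph(docs))
for label, tokens in sorted(topic_tokens(docs, labels).items()):
    print(label, tokens)
```

Unlike LDA, this kind of pipeline does not need the number of topics $K$ fixed in advance, since the number of communities falls out of the graph structure. A real system would likely use weighted edges and a more robust community detection algorithm than this sketch.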