Detecting Hot Topics in Technology News Streams

Bo You, Ming Liu, Bingquan Liu, Xiaolong Wang
DOI: https://doi.org/10.1109/icmlc.2012.6359678
2012-01-01
Abstract: Detecting fine-grained hot topics in technology news streams is an interesting and important problem, given the large volume of reports and the relatively narrow range of topics. In this paper, a three-phase method is proposed. In the first phase, a document-topic distribution vector is generated and keywords are extracted for each document using the pachinko allocation topic model. In the second phase, the documents are clustered with affinity propagation, based on the document-topic distribution vectors obtained in the first phase. In the last phase, actual events, denoted by combinations of keywords within each cluster, are identified using frequent pattern mining algorithms. We evaluate our approach on a collection of technology news reports gathered from various sites over a fixed time period. The results show that this method is effective.
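The third phase can be illustrated with a minimal sketch. It assumes the first two phases have already produced a cluster of documents, each reduced to a keyword list. The abstract does not name a specific frequent pattern mining algorithm, so this sketch simply counts every keyword combination by brute force; the function name, the `min_support` and `max_size` parameters, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_keyword_sets(docs_keywords, min_support=2, max_size=3):
    """Return keyword combinations found in at least min_support documents.

    Brute-force counting, used here for illustration only; it is not
    necessarily the mining algorithm used in the paper.
    """
    counts = Counter()
    for keywords in docs_keywords:
        # Deduplicate and sort so each combination has one canonical form.
        unique = sorted(set(keywords))
        for size in range(2, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Hypothetical cluster: one keyword list per document.
cluster = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "sales"],
    ["apple", "launch", "iphone"],
    ["google", "android"],
]

patterns = frequent_keyword_sets(cluster, min_support=2)
for combo, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(combo, support)
```

Each surviving combination is a candidate event description for the cluster. Brute-force enumeration grows quickly with the number of keywords per document, which is why a dedicated algorithm such as Apriori or FP-growth would be preferable on a real collection.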