Research And Implementation Of Hot Topic Detection System Based On Web

Bing Zhu, Yang Yu, Chuanzhen Li, Hui Wang
DOI: https://doi.org/10.1109/compcomm.2017.8322791
2017-01-01
Abstract: To alleviate the problem of "topic drift" in hot topics on the Web, a hot topic detection system is implemented. The system consists of three stages. First, news data are collected from the Web and segmented into Chinese words; second, word weights are calculated with an improved term frequency-inverse document frequency (TF-IDF) formula; finally, the Single-Pass clustering algorithm is used for text clustering. The improved TF-IDF formula incorporates the position information of each word. Experiments show that the system can effectively identify hot topics.
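The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the improved TF-IDF formula, so everything specific here is an assumption: position information is modeled as a fixed boost for words that occur in the title (`TITLE_BOOST`), the IDF term uses a common smoothed form, documents are compared with cosine similarity, and the clustering threshold is an arbitrary illustrative value. The input is assumed to be already segmented into tokens, and all function names are hypothetical.

```python
import math
from collections import Counter

TITLE_BOOST = 3.0  # assumed extra weight per title occurrence; not from the paper


def weighted_tf(title_tokens, body_tokens, title_boost=TITLE_BOOST):
    """Term frequency where each title occurrence adds `title_boost` to the count."""
    counts = Counter(body_tokens)
    for tok in title_tokens:
        counts[tok] += title_boost
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def tfidf_vectors(docs, title_boost=TITLE_BOOST):
    """docs: list of (title_tokens, body_tokens). Returns one sparse dict per document."""
    n = len(docs)
    df = Counter()
    for title, body in docs:
        df.update(set(title) | set(body))  # document frequency of each term
    vectors = []
    for title, body in docs:
        tf = weighted_tf(title, body, title_boost)
        # Smoothed IDF (an assumed variant), always positive.
        vectors.append(
            {t: w * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, w in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass(vectors, threshold=0.3):
    """Single-Pass clustering: each document is seen once, in arrival order.

    It joins the most similar existing cluster if the similarity reaches
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": dict, "members": [doc indices]}
    for i, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            k = len(best["members"])
            cen = best["centroid"]
            # Update the centroid as the running mean of its member vectors.
            for t in set(cen) | set(vec):
                old = cen.get(t, 0.0)
                cen[t] = old + (vec.get(t, 0.0) - old) / k
        else:
            clusters.append({"centroid": dict(vec), "members": [i]})
    return [c["members"] for c in clusters]


# Toy, already-segmented documents: (title tokens, body tokens).
docs = [
    (["earthquake", "rescue"], ["earthquake", "rescue", "team", "arrives"]),
    (["earthquake", "aid"], ["earthquake", "aid", "rescue", "supplies"]),
    (["football", "final"], ["football", "final", "score", "goal"]),
]
clusters = single_pass(tfidf_vectors(docs))
```

In this toy run the two earthquake stories fall into one cluster and the football story starts its own. Because Single-Pass processes documents in arrival order, both the threshold and the document order affect the resulting clusters, so the values above would need tuning on real news data.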