Topic detection technique based on the vector center model

Yongping Du
2009-01-01
Abstract: Topic detection identifies new topics in a stream of news stories, and it has become an active research direction in natural language processing in recent years. In this paper, effective features are extracted from each story document, and a vector center model is used to represent both topics and stories. An incremental clustering algorithm is then applied for topic detection: it identifies the emergence of new events and merges stories into their corresponding topic clusters. Finally, we present a statistical analysis of the results; the method achieves an F value of 75% on the test set. Topic detection can also provide useful guidance for identifying hot spots on the web.
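The incremental clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature extraction, the similarity measure, or the threshold, so everything concrete here is an assumption made for illustration: whitespace-tokenized term frequencies as features, cosine similarity, a topic center computed as the mean of its member vectors, and a hypothetical threshold of 0.3. The names `vectorize`, `cosine`, `Topic`, and `detect_topics` are likewise hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector over lowercased whitespace tokens (assumed features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Topic:
    """A topic cluster represented by the center (mean) of its story vectors."""

    def __init__(self, vec):
        self.total = Counter(vec)  # running sum of member vectors
        self.size = 1              # number of member stories

    def center(self):
        return {t: w / self.size for t, w in self.total.items()}

    def add(self, vec):
        self.total.update(vec)     # Counter.update adds counts
        self.size += 1


def detect_topics(stories, threshold=0.3):
    """Process stories in arrival order; return (topic index per story, topics).

    A story whose best similarity to every existing center falls below the
    (assumed) threshold starts a new topic; otherwise it is merged into the
    most similar topic, whose center is then updated.
    """
    topics, labels = [], []
    for text in stories:
        vec = vectorize(text)
        best_idx, best_sim = -1, 0.0
        for i, topic in enumerate(topics):
            sim = cosine(vec, topic.center())
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= threshold:
            topics[best_idx].add(vec)       # merge into existing topic
        else:
            topics.append(Topic(vec))       # new topic detected
            best_idx = len(topics) - 1
        labels.append(best_idx)
    return labels, topics


if __name__ == "__main__":
    stream = [
        "earthquake hits city center",
        "city earthquake rescue continues",
        "team wins football final",
    ]
    labels, topics = detect_topics(stream)
    print(labels, len(topics))
```

In this toy stream the second story shares two of four terms with the first and is merged into the same topic, while the third shares none and opens a new one. A real system would likely weight terms (for example with TF-IDF) and tune the threshold on held-out data; those choices are outside what the abstract states.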