Simultaneous Enhance Text Clustering and Annotation Based on Topic Model and Random Walks

Jiashen Sun, Xiaojie Wang, Caixia Yuan
DOI: https://doi.org/10.1109/icciautom.2011.6183893
2011-01-01
Abstract: Web page clustering and annotation promise improved search and browsing on the web and have received significant attention in the past; however, most approaches have targeted only one of the two problems. In this paper, we address text clustering and annotation as a joint problem and show how the two enhance each other. We first present a topic model, from which we construct an association graph over tags, documents, and topics; we then perform random walks over the graph to achieve clustering and annotation simultaneously. We evaluate our model on real-world data sampled from del.icio.us and ODP, showing that it improves annotation and clustering performance over two strong baseline models.
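The abstract does not give the details of the graph or the walk, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy tripartite graph whose edge weights stand in for topic-model output, and it swaps in a standard random walk with restart. The node names, the weights, the `restart` value, and the helper `cluster_and_annotate` are all hypothetical. Scores on topic nodes are read as a cluster assignment, and scores on tag nodes as an annotation ranking.

```python
from collections import defaultdict


def build_graph(doc_topic, topic_tag):
    """Build an undirected weighted graph over document, topic, and tag nodes."""
    graph = defaultdict(dict)
    for doc, topics in doc_topic.items():
        for topic, weight in topics.items():
            graph[doc][topic] = weight
            graph[topic][doc] = weight
    for topic, tags in topic_tag.items():
        for tag, weight in tags.items():
            graph[topic][tag] = weight
            graph[tag][topic] = weight
    return graph


def random_walk_with_restart(graph, start, restart=0.15, iters=100):
    """Power iteration for a random walk that jumps back to `start`
    with probability `restart` at every step (assumed walk variant)."""
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            total = sum(graph[node].values())
            # Spread the non-restarting mass to neighbors, proportional to edge weight.
            for nbr, weight in graph[node].items():
                nxt[nbr] += (1 - restart) * mass * weight / total
        nxt[start] += restart
        scores = nxt
    return scores


def cluster_and_annotate(graph, doc):
    """Read a cluster (best topic node) and a tag ranking from one walk."""
    scores = random_walk_with_restart(graph, doc)
    topics = [n for n in scores if n.startswith("topic:")]
    tags = [n for n in scores if n.startswith("tag:")]
    cluster = max(topics, key=scores.get)
    ranked_tags = sorted(tags, key=scores.get, reverse=True)
    return cluster, ranked_tags, scores


# Hypothetical weights; in practice they would come from a fitted topic model.
doc_topic = {
    "doc:python-tutorial": {"topic:0": 0.9, "topic:1": 0.1},
    "doc:numpy-guide": {"topic:0": 0.8, "topic:1": 0.2},
    "doc:pasta-recipe": {"topic:0": 0.1, "topic:1": 0.9},
}
topic_tag = {
    "topic:0": {"tag:programming": 0.7, "tag:python": 0.3},
    "topic:1": {"tag:cooking": 0.6, "tag:food": 0.4},
}

graph = build_graph(doc_topic, topic_tag)
cluster, ranked_tags, scores = cluster_and_annotate(graph, "doc:pasta-recipe")
print(cluster, ranked_tags)
```

Because a single walk yields scores for both topic and tag nodes, the cluster assignment and the tag ranking come from the same computation, which is one way to read the abstract's claim that the two tasks are solved simultaneously.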