Learning-based topic detection using multiple features

Zheng Hai-Tao, Wang Zhe, Wang Wei, Sangaiah Arun Kumar, Xiao Xi, Zhao Congzhi
DOI: https://doi.org/10.1002/cpe.4444
2018-01-01
Abstract: Microblog sites such as Twitter have recently attracted a great deal of attention as an information resource for the topic detection task. Most existing feature-pivot topic detection algorithms for Twitter take only a single feature into account rather than multiple features. As a result, these methods detect only the topics related to that single feature and miss some important topics, which leads to relatively low performance. In this paper, we build a flexible term representation framework for feature-pivot topic detection based on four features. We propose a Learning-based Topic Detection using Multiple Features (LTDMF) method to improve the performance of topic detection. We define a correlation function based on a specific neural network to integrate the various features. A Hierarchical Agglomerative Clustering (HAC) algorithm is then applied to cluster terms into topics. By drawing on multiple features, LTDMF detects all types of topics and improves the accuracy of topic detection, addressing the problem of missing topics. Experiments show that LTDMF outperforms several baseline methods in terms of precision and recall.
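The pipeline the abstract describes (score pairs of terms with a correlation function, then cluster terms with HAC) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not name the four features or the network architecture, so the terms, the feature values, the weights, and the threshold are all hypothetical, and a fixed weighted sum passed through a sigmoid stands in for the learned neural network. Average linkage is also an assumed choice.

```python
import math
from itertools import combinations

# Hypothetical terms, each with four made-up feature values in [0, 1].
# The abstract does not say what the four features are.
TERMS = {
    "earthquake": (0.9, 0.8, 0.1, 0.2),
    "tsunami":    (0.8, 0.9, 0.2, 0.1),
    "aftershock": (0.9, 0.7, 0.1, 0.3),
    "election":   (0.1, 0.2, 0.9, 0.8),
    "ballot":     (0.2, 0.1, 0.8, 0.9),
}

# Stand-in parameters; a learning-based method would train these instead.
WEIGHTS = (2.0, 2.0, 2.0, 2.0)
BIAS = -6.0


def correlation(a, b):
    """Score in (0, 1) for how strongly two terms belong together."""
    # Per-feature closeness: 1.0 when the two values match exactly.
    closeness = [1.0 - abs(x - y) for x, y in zip(TERMS[a], TERMS[b])]
    z = sum(w * c for w, c in zip(WEIGHTS, closeness)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def average_link(c1, c2):
    """Mean pairwise correlation between two clusters of terms."""
    total = sum(correlation(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def hac(terms, threshold=0.5):
    """Merge the most correlated pair of clusters until none reaches threshold."""
    clusters = [[t] for t in terms]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average correlation.
        (i, j), best = max(
            (((i, j), average_link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        # i < j, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


topics = hac(list(TERMS))
for topic in topics:
    print(topic)
```

With these made-up values, the disaster-related terms and the election-related terms end up in separate clusters, each cluster playing the role of one detected topic. Stopping at a correlation threshold rather than at a fixed cluster count is a design choice of this sketch, since the number of topics is usually not known in advance.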