An Improved Latent Dirichlet Allocation Method For Service Topic Detection

Lantian Guo, Zhe Li, Tao Yang, Huixiang Zhang, Dejun Mu, Yang Li
DOI: https://doi.org/10.1109/ChiCC.2016.7554469
2016-01-01
Abstract: Service topic detection is one of the most important techniques in service information extraction, clustering, and recommendation. Compared with the short-text corpora of social networks, service description corpora have higher dimensionality and greater diversity, which makes it difficult to detect topics from a large number of service descriptions. To address these challenges, we propose a new topic detection method based on the LDA (Latent Dirichlet Allocation) model, referred to as CV-LDA (Context-sensitive word Vector based LDA). It uses a word-embedding-based method that generates context-sensitive vectors, which are used to cluster words and thereby reduce dimensionality. Topic perplexity analysis on a real-world dataset shows that topics detected by our method have lower perplexity than those obtained with word-frequency-weighting-based vectors.
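The abstract evaluates topics by perplexity but does not state the formula. As a minimal sketch, assuming the standard held-out definition (the exponential of the negative per-token log-likelihood), the metric could be computed as follows. The function name `perplexity` and the toy inputs are illustrative assumptions, not taken from the paper.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """Corpus perplexity under the standard definition (an assumption here):
    exp(-(sum of per-document log-likelihoods) / (total number of tokens)).
    Lower values mean the model predicts the corpus better."""
    total_tokens = sum(doc_lengths)
    if total_tokens == 0:
        raise ValueError("corpus has no tokens")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)


# Toy check (hypothetical data): a model that assigns every token
# probability 1/4 should have perplexity equal to 4.
lengths = [3, 5]
log_liks = [n * math.log(0.25) for n in lengths]
uniform_ppl = perplexity(log_liks, lengths)
```

Under this definition, a model that spreads probability uniformly over V words has perplexity V, so the metric can be read as an effective number of equally likely word choices per token.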