STMLRC: Sparse Topic Model with Low Rank Constraint

Chao LIU, Lian-sheng ZHUANG, Neng-hai YU
2014-01-01
Computer Science
Abstract: The projection matrix learned by classic Latent Semantic Analysis is generally dense, which leads to high storage cost and unclear semantics for each topic. To tackle this problem, this paper proposes a novel sparse topic model. By enforcing sparsity on the projection matrix, the new model selects only a small number of relevant words for each topic and hence yields a clear semantic interpretation. Moreover, by enforcing a low-rank constraint on the encoding matrix, data projected into the topic subspace show better clustering behavior. Experimental results show that the topic subspace learned by the new model favors classification and significantly reduces the storage cost of the projection matrix.
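The abstract does not state the objective function or the solver, so the following is only an illustrative sketch, not the authors' algorithm. It assumes the two constraints are handled through their common convex surrogates: an L1 penalty for sparsity and a nuclear-norm penalty for low rank. The function names, matrix shapes, and threshold values are hypothetical.

```python
import numpy as np

def soft_threshold(M, tau):
    """Proximal operator of tau * ||M||_1: shrinks every entry toward zero."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of tau * ||M||_*: shrinks every singular value."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# Hypothetical shapes: 6 words, 3 topics, 5 documents (assumed for illustration).
rng = np.random.default_rng(0)
P = rng.normal(size=(6, 3))  # words x topics projection matrix
V = rng.normal(size=(3, 5))  # topics x documents encoding matrix

P_sparse = soft_threshold(P, 0.5)              # small word weights become exactly zero
V_lowrank = singular_value_threshold(V, 1.0)   # weak singular directions are removed
```

In a full solver these two operators would typically be applied inside an alternating or proximal-gradient loop that also fits the document-word matrix. The sketch shows only the two shrinkage steps that produce a sparse projection matrix and a low-rank encoding matrix.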