An Improved Regularized Latent Semantic Indexing with L1/2 Regularization and Non-negative Constraints

Yong Chen, Hui Zhang, Yuan Zuo, Deqing Wang
DOI: https://doi.org/10.1109/CSE.2013.156
2013-01-01
Abstract: Topic models have recently become increasingly popular in fields such as information retrieval and semantic relatedness computing, but their practical application is limited by poor scalability: they cannot be executed efficiently on large-scale datasets in parallel. In this paper, we introduce an improved Regularized Latent Semantic Indexing (RLSI) with L1/2 regularization and non-negative constraints. The method formalizes topic modeling as the problem of minimizing a quadratic loss function regularized by the L1/2 and L2 norms under non-negative constraints. This formulation allows the learning process to be decomposed into a series of mutually independent sub-optimization problems that can be processed in parallel, and the method can therefore handle large-scale data. The non-negative constraints and L1/2 regularization make our model more practical and better suited to information retrieval and semantic relatedness computing. Extensive experimental results show that our improved model can handle large-scale text data and that it is also very effective compared with several state-of-the-art topic models.
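The formulation described in the abstract can be sketched as an optimization problem. The abstract gives no notation, so every symbol here is an assumption: D is an M x N term-document matrix with columns d_n, U is an M x K term-topic matrix with columns u_k, V is a K x N topic-document matrix with columns v_n, and lambda_1, lambda_2 are regularization weights. The assignment of the L1/2 penalty to the topics and the L2 penalty to the document representations is also an assumption, chosen by analogy with the original RLSI; the paper may define it differently.

```latex
% Assumed notation: D (M x N), U (M x K), V (K x N); not taken from the abstract.
\min_{U \ge 0,\; V \ge 0} \;
  \sum_{n=1}^{N} \lVert d_n - U v_n \rVert_2^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \lVert u_k \rVert_{1/2}^{1/2}
  \;+\; \lambda_2 \sum_{n=1}^{N} \lVert v_n \rVert_2^2,
\qquad
\lVert u_k \rVert_{1/2}^{1/2} = \sum_{m=1}^{M} \lvert u_{mk} \rvert^{1/2}.

% With U fixed, the objective separates over documents (columns of V):
\min_{v_n \ge 0} \; \lVert d_n - U v_n \rVert_2^2 + \lambda_2 \lVert v_n \rVert_2^2,
\qquad n = 1, \dots, N.

% With V fixed, it separates over terms (rows \bar{u}_m of U, rows \bar{d}_m of D):
\min_{\bar{u}_m \ge 0} \; \lVert \bar{d}_m - V^{\top} \bar{u}_m \rVert_2^2
  + \lambda_1 \sum_{k=1}^{K} \lvert u_{mk} \rvert^{1/2},
\qquad m = 1, \dots, M.
```

Under these assumptions, each of the N document sub-problems and each of the M term sub-problems touches only its own variables, which is what would let an alternating scheme distribute them across workers.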