LDA-based document models for cross language information retrieval

Yuan Lin, Hongfei Lin, Kan Xu, Jian Ning
DOI: https://doi.org/10.12733/jcis6904
2013-01-01
Journal of Computational Information Systems
Abstract: This paper proposes a bilingual biomedical topic space model in which both Chinese and English abstracts are represented by an improved Latent Dirichlet Allocation (LDA) model with SVD and NMF decomposition smoothing methods. The improved LDA-based method is used to calculate the relationship between query and document and to combine the results of the different matrix factorizations. A set of models with different dimensions is set up, with the help of which we can achieve bilingual cross-language indexing. The experimental results show that our method effectively improves retrieval accuracy. © 2013 Binary Information Press.