A Linkage Clustering Based Query Expansion Algorithm

Li Pohan, He Zhenying, Xiang Helin
2011-01-01
Journal of Computer Research and Development
Abstract: Latent Semantic Analysis (LSA) is a method for automatically extracting and representing knowledge; it uncovers latent links between words through statistical analysis of large text collections. LSA effectively addresses the problem of polysemy, but the computational cost of large matrix operations and storage limitations restrict its use on large-scale data sets. Meanwhile, data objects in a relational database are cross-linked with each other via multi-typed links. These links carry a wealth of latent semantic information, so the similarity between data objects can also be inferred from them. In this paper, we propose a new query algorithm based on linkage-based clustering. The data objects are first clustered using their linkages, and LSA is then performed on the clusters instead of on the individual documents, which substantially reduces the number of documents to be processed. During retrieval, the algorithm finds the cluster closest to the query and returns the documents in that cluster to the user. Experimental results show that the proposed approach takes full advantage of the linkages between data objects and produces clear clusters; applying LSA after clustering reduces space and time overhead exponentially, and retrieval accuracy also improves.
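The pipeline outlined in the abstract (cluster by links, run LSA over clusters, return the documents of the closest cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation: connected components of the link graph stand in for the paper's linkage-based clustering, which the abstract does not specify, and the function names (`cluster_by_links`, `build_lsa`, `retrieve`) and the toy documents and links are hypothetical.

```python
import numpy as np

def cluster_by_links(n_docs, links):
    """Group documents into connected components of the link graph.
    Assumption: a simple stand-in for the paper's linkage-based clustering."""
    parent = list(range(n_docs))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in links:
        parent[find(a)] = find(b)
    groups = {}
    for d in range(n_docs):
        groups.setdefault(find(d), []).append(d)
    return sorted(groups.values())

def build_lsa(docs, clusters, k):
    """Build a term-by-cluster count matrix and keep its rank-k SVD."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(clusters)))
    for c, members in enumerate(clusters):
        for d in members:
            for w in docs[d].split():
                A[index[w], c] += 1.0
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return index, U[:, :k], s[:k], Vt[:k].T  # rows of the last item are cluster vectors

def retrieve(query, docs, clusters, model):
    """Fold the query into the latent space and return the closest cluster's documents."""
    index, U, s, cluster_vecs = model
    q = np.zeros(U.shape[0])
    for w in query.split():
        if w in index:
            q[index[w]] += 1.0
    q_hat = (q @ U) / s  # standard LSA fold-in: q^T U_k S_k^{-1}
    norms = np.linalg.norm(cluster_vecs, axis=1) * np.linalg.norm(q_hat) + 1e-12
    sims = (cluster_vecs @ q_hat) / norms  # cosine similarity per cluster
    best = int(np.argmax(sims))
    return [docs[d] for d in clusters[best]]

# Toy data, invented for illustration only.
docs = [
    "database index query",
    "sql query optimizer",
    "guitar chord melody",
    "piano melody harmony",
]
links = [(0, 1), (2, 3)]  # hypothetical links between data objects

clusters = cluster_by_links(len(docs), links)
model = build_lsa(docs, clusters, k=2)
result = retrieve("melody", docs, clusters, model)
```

Because the SVD is taken over one column per cluster rather than one per document, the matrix shrinks with the number of clusters, which is the source of the savings the abstract describes.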