Research of Spectral Clustering Based on Probabilistic Latent Semantic Analysis

ZHANG Yufang, ZHANG Hong, XIONG Zhongyang, LI Wentian
DOI: https://doi.org/10.3778/j.issn.1002-8331.2011.36.037
2011-01-01
Computer Engineering and Applications Journal
Abstract: The similarity matrix used in traditional spectral clustering is built on the vector space model, which treats index terms as independent units and ignores the many synonyms and polysemous words found in natural language. To address this problem, the paper proposes a method that uses Probabilistic Latent Semantic Analysis (PLSA) to extract the semantic information implicit in the text and to construct the similarity matrix, so that the similarities between texts are taken into account. Experiments indicate that a similarity matrix built with PLSA greatly improves categorization precision and yields better results than traditional spectral clustering, which further demonstrates the effectiveness of PLSA.
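The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the similarity matrix is derived from the PLSA output, so everything below is an assumption rather than the authors' implementation: documents are compared by cosine similarity of their topic mixtures P(z|d), the clustering step is a simple two-way split on the Fiedler vector of the normalized Laplacian, and the function names (`plsa`, `cosine_similarity_matrix`, `spectral_bipartition`) and the toy term counts are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=200, seed=0):
    """Fit PLSA by EM on a document-term count matrix; return P(z|d)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z); shape (d, z, w).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d

def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, 1e-12)
    return unit @ unit.T

def spectral_bipartition(similarity):
    """Two-way split by the sign of the Fiedler vector of I - D^-1/2 S D^-1/2."""
    d_inv_sqrt = 1.0 / np.sqrt(similarity.sum(axis=1))
    normalized = d_inv_sqrt[:, None] * similarity * d_inv_sqrt[None, :]
    laplacian = np.eye(len(similarity)) - normalized
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (eigvecs[:, 1] > 0).astype(int)

# Invented toy corpus: rows are documents, columns are term counts.
counts = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 0, 1, 2],
])

topic_mix = plsa(counts, n_topics=2)               # P(z|d), one row per document
similarity = cosine_similarity_matrix(topic_mix)   # PLSA-based similarity matrix
labels = spectral_bipartition(similarity)          # cluster label per document
```

In this sketch the similarity is computed in topic space rather than term space, so two documents that share few literal terms can still score as similar when PLSA assigns them to the same latent topic. EM for PLSA converges only to a local optimum, so results depend on the random seed; a fuller implementation would typically run several restarts and, for more than two clusters, apply k-means to the leading eigenvectors.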