THE IMPROVEMENT OF VSM MODEL BASED ON SEMANTICS

Su Yu, Zheng Cheng, Ma Zhongjie
DOI: https://doi.org/10.3969/j.issn.1000-386X.2011.08.046
2011-01-01
Abstract: Text clustering is widely applied in many fields. However, traditional text clustering methods do not take semantic factors into account, so their clustering results are unsatisfactory. In this paper, we use semantics to transform the vector space model (VSM): each dimension of the VSM is distorted according to semantics, turning the original orthogonal coordinate system into a semantics-based oblique coordinate system, and the feature vectors of the texts are then mapped onto the transformed VSM. Clustering is performed after this transformation. The method reduces the semantic distance between feature vectors that are semantically related, and can therefore raise the recall and precision of text clustering and make the clustering results more semantically meaningful.
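The abstract does not give the transformation itself, so the following is only a minimal sketch of one common way to read "oblique coordinate system": the term axes are no longer mutually perpendicular, and the cosine of the angle between two axes is taken to be the semantic similarity of the two terms. Under that assumption, the inner product of two text vectors x and y becomes x^T G y, where G is the Gram matrix of the axes. The vocabulary, the similarity values, and the function names are hypothetical and are not taken from the paper.

```python
import math


def oblique_inner(x, y, gram):
    """Inner product of x and y in a basis whose Gram matrix is `gram`.

    gram[i][j] is the assumed cosine of the angle between term axes i and j.
    With the identity matrix this reduces to the ordinary dot product.
    """
    return sum(x[i] * gram[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


def oblique_cosine(x, y, gram):
    """Cosine similarity of x and y measured in the oblique coordinate system."""
    norm_x = math.sqrt(oblique_inner(x, x, gram))
    norm_y = math.sqrt(oblique_inner(y, y, gram))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return oblique_inner(x, y, gram) / (norm_x * norm_y)


# Hypothetical three-term vocabulary: ["car", "automobile", "banana"].
# ORTHOGONAL models the traditional VSM (all axes perpendicular).
ORTHOGONAL = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]

# SEMANTIC tilts the "car" and "automobile" axes toward each other.
# The value 0.8 is an illustrative assumption, not a measured similarity.
SEMANTIC = [[1.0, 0.8, 0.0],
            [0.8, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

doc_car = [1.0, 0.0, 0.0]         # a text that mentions only "car"
doc_automobile = [0.0, 1.0, 0.0]  # a text that mentions only "automobile"
doc_banana = [0.0, 0.0, 1.0]      # a text that mentions only "banana"

if __name__ == "__main__":
    print(oblique_cosine(doc_car, doc_automobile, ORTHOGONAL))
    print(oblique_cosine(doc_car, doc_automobile, SEMANTIC))
    print(oblique_cosine(doc_car, doc_banana, SEMANTIC))
```

In the orthogonal system the "car" and "automobile" texts share no term and have similarity 0, while in the oblique system their similarity rises to the assumed axis similarity, which is the effect the abstract describes for semantically related vectors. For the geometry to be valid, the matrix G should be symmetric and positive semidefinite; how the paper builds it is not stated in the abstract.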