SMM: A Sense Matrix Model

Sun Bin
2005-01-01
Abstract: This paper gives a brief introduction to the Sense Matrix Model (SMM), which uses a matrix representation of text for information retrieval. By taking the distribution of words along the sense direction into account, SMM represents a document as a term-sense matrix and a document collection as a term-sense-document space. With this representation, several useful data analysis techniques can be introduced or developed, including similarities based on matrix norms, sense weighting, document transforms with DCT as well as MAD (multi-way data decomposition), and kNN and SVM classification using the sense matrix representation. The model also provides novel techniques for cross-lingual IR and multilingual text classification that require neither separate or integrated translation nor model training. Some initial experimental results of document DCT with the SMART IR system are also discussed.
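
To make the idea of a term-sense matrix and a matrix-norm based similarity more concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that each column ("sense") corresponds to one consecutive segment of the document, that entries are raw term counts, and that similarity is the Frobenius inner product normalized by the Frobenius norms. The function names `term_sense_matrix`, `frobenius_norm`, and `frobenius_similarity` are hypothetical.

```python
import math


def term_sense_matrix(segments, vocabulary):
    """Build a term-by-sense count matrix as a list of rows.

    Assumption (not from the paper): each element of `segments` is one
    "sense" column, and entries are raw term counts in that segment.
    """
    index = {term: i for i, term in enumerate(vocabulary)}
    matrix = [[0.0] * len(segments) for _ in vocabulary]
    for j, segment in enumerate(segments):
        for token in segment.lower().split():
            if token in index:
                matrix[index[token]][j] += 1.0
    return matrix


def frobenius_norm(m):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in m for x in row))


def frobenius_similarity(a, b):
    """Cosine-style similarity between two matrices of the same shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    inner = sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    denom = frobenius_norm(a) * frobenius_norm(b)
    return inner / denom if denom else 0.0


if __name__ == "__main__":
    vocab = ["matrix", "sense", "retrieval"]
    doc_a = term_sense_matrix(["sense matrix", "matrix retrieval"], vocab)
    doc_b = term_sense_matrix(["retrieval", "sense"], vocab)
    print(frobenius_similarity(doc_a, doc_a))
    print(frobenius_similarity(doc_a, doc_b))
```

Under these assumptions, two documents that share terms but place them in different segments can still receive a low similarity, which is one way a matrix representation differs from a plain term-vector representation.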