Chinese Text Classification Based on Tensor Space Model

HE Wei, HU Xue-gang, XIE Fei
DOI: https://doi.org/10.3969/j.issn.1003-5060.2010.12.011
2010-01-01
Abstract: To address the complex preprocessing and the curse of dimensionality that the traditional vector space model incurs on high-dimensional data, this paper proposes a Chinese text categorization method based on the tensor space model. The method represents the text set as a third-order tensor and extends the vector-based k-nearest neighbors (kNN) classifier to a tensor-based one. It simplifies preprocessing, improves precision, and makes it possible to apply more tensor methods to Chinese text categorization. The results show that this method achieves higher categorization precision and has practical value.
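The abstract does not say how the tensor is built or which distance the tensor-based kNN uses, so the following is only a minimal sketch under stated assumptions: each document is taken to be a matrix of features, the text set stacks these matrices into a third-order tensor whose first mode indexes documents, and the distance between two documents is the Frobenius norm of their difference. The function name `knn_predict_tensor`, the toy data, and the class labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

import numpy as np


def knn_predict_tensor(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest document slices.

    train  : array of shape (n_docs, rows, cols); assumed layout in which the
             first mode of the third-order tensor indexes documents.
    labels : sequence of n_docs class labels.
    query  : array of shape (rows, cols), a single document.
    """
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Frobenius distance between the query and every document slice
    # (an assumed metric; the paper may define a different tensor distance).
    dists = np.sqrt(((train - query) ** 2).sum(axis=(1, 2)))
    # Indices of the k smallest distances; a stable sort keeps ties in input order.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: four 2x2 "documents" in two classes (values are made up).
train = np.array([
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 0.9]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.0, 0.9], [1.0, 0.0]],
])
labels = ["sports", "sports", "finance", "finance"]
query = np.array([[0.9, 0.1], [0.0, 1.0]])
prediction = knn_predict_tensor(train, labels, query, k=3)
```

Because the distance is computed directly on the matrix slices, no document has to be flattened into a single long vector first, which is one plausible reading of how a tensor representation could simplify preprocessing.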