Class Hierarchical Structure-Based Text Classification

Xiaoyun Chen, Jinhua Chen
DOI: https://doi.org/10.4028/www.scientific.net/amr.255-260.2233
2011-01-01
Advanced Materials Research
Abstract: Text classification becomes more difficult as the number of classes increases, and organizing the classes into a hierarchical structure is a viable solution. However, the hierarchical structure is usually maintained only by hand, which requires substantial manpower to find the correct position of a document in the class hierarchy or to reconstruct the hierarchy. Constructing the hierarchical structure automatically by clustering the training samples can effectively reduce the cost of manual maintenance. At the same time, it avoids the conflict between prior knowledge and the statistical properties of the sample set that manual maintenance of the hierarchy can cause.
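The idea of deriving a class hierarchy from the training samples can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or similarity measure the authors use, so this example assumes bottom-up (agglomerative) merging of class centroids under cosine similarity. The function names (`centroid`, `cosine`, `build_class_hierarchy`) and the toy term-count vectors are hypothetical and serve only as an illustration.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_class_hierarchy(samples):
    """Build a binary class tree from (label, vector) training samples.

    Assumption: classes whose centroids are most similar are merged first.
    Returns a nested tuple whose leaves are class labels.
    """
    by_class = defaultdict(list)
    for label, vec in samples:
        by_class[label].append(vec)

    # Each cluster is (subtree, centroid, number_of_samples).
    clusters = [
        (label, centroid(vecs), len(vecs))
        for label, vecs in sorted(by_class.items())
    ]

    while len(clusters) > 1:
        # Pick the pair of clusters with the most similar centroids.
        i, j = max(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: cosine(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (ta, ca, na), (tb, cb, nb) = clusters[i], clusters[j]
        # The merged centroid is the sample-weighted mean of the two centroids.
        merged_centroid = [
            (x * na + y * nb) / (na + nb) for x, y in zip(ca, cb)
        ]
        merged = ((ta, tb), merged_centroid, na + nb)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]


# Toy term-count vectors (hypothetical data): two sports and two finance classes.
samples = [
    ("soccer", [5, 1, 0, 0]),
    ("soccer", [4, 2, 0, 0]),
    ("tennis", [4, 1, 1, 0]),
    ("stocks", [0, 0, 5, 2]),
    ("bonds", [0, 1, 4, 3]),
]
tree = build_class_hierarchy(samples)
print(tree)
```

On this toy data the two sports classes end up under one internal node and the two finance classes under another, so a top-down classifier would first choose between two groups and only then between the classes inside the chosen group. A real system would likely use weighted term features and a stopping criterion rather than merging all the way to a single root.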