Label correlation mixture model for multi-label text categorization

Zhiyang He, Ji Wu, Ping Lv
DOI: https://doi.org/10.1109/SLT.2014.7078554
2014-01-01
Abstract: Multi-label text categorization is more difficult, but also more practical, than conventional binary or multi-class text categorization. This paper proposes a novel probabilistic generative model, the label correlation mixture model (LCMM), to describe multi-labeled documents; it can be used for multi-label text categorization. In LCMM, labels and topics are in one-to-one correspondence. LCMM consists of two parts: a label correlation model and a multi-label conditioned document model. The former formulates the generating process of labels and takes the dependencies between labels into account. We also propose an efficient algorithm for calculating the probability of generating an arbitrary subset of labels. The multi-label conditioned document model can be regarded as a supervised label mixture model, in which the labels for a document are known. To evaluate LCMM, multi-label text categorization experiments are performed on three standard text data sets. The experimental results demonstrate the effectiveness of LCMM compared with other reported methods.