Multi-label Text Categorization with Hidden Components

Li, Longkai Zhang, Houfeng Wang
DOI: https://doi.org/10.3115/v1/d14-1193
2014-01-01
Abstract: Multi-label text categorization (MTC) is a supervised learning task in which a document may be assigned multiple categories (labels) simultaneously. The labels in MTC are correlated, and this correlation gives rise to hidden components that represent the "shared" variance of the correlated labels. In this paper, we propose a method that uses hidden components for MTC. The proposed method employs PCA to capture the hidden components and incorporates them into a joint learning framework to improve performance. Experiments with real-world data sets and evaluation metrics validate the effectiveness of the proposed method.
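To make the idea of a hidden component concrete, the sketch below runs PCA on a small, made-up binary label matrix and extracts the leading principal direction by power iteration. It illustrates only the PCA step on labels. The joint learning framework is not specified in the abstract and is not reproduced here. The toy matrix `Y` and the helper names `center_columns`, `label_covariance`, and `top_component` are assumptions introduced for illustration, not the authors' code.

```python
import math

def center_columns(Y):
    """Subtract each label's mean so the covariance is taken around zero."""
    n, k = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(k)]
    return [[row[j] - means[j] for j in range(k)] for row in Y]

def label_covariance(Yc):
    """Sample covariance matrix (k x k) of the centered label columns."""
    n, k = len(Yc), len(Yc[0])
    return [[sum(r[a] * r[b] for r in Yc) / (n - 1) for b in range(k)]
            for a in range(k)]

def top_component(C, iters=200):
    """Leading eigenvector and eigenvalue of C via power iteration."""
    k = len(C)
    v = [1.0 / math.sqrt(k)] * k  # deterministic starting vector
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[a] * sum(C[a][b] * v[b] for b in range(k))
                     for a in range(k))
    return v, eigenvalue

# Hypothetical toy data: 6 documents, 3 labels.
# Labels 0 and 1 always co-occur; label 2 varies mostly on its own.
Y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

Yc = center_columns(Y)
C = label_covariance(Yc)
v, lam = top_component(C)

# Fraction of total label variance captured by the first hidden component.
explained = lam / sum(C[j][j] for j in range(len(C)))

# Per-document score on the hidden component (projection onto v).
scores = [sum(r[j] * v[j] for j in range(len(v))) for r in Yc]

print("component loadings:", [round(x, 3) for x in v])
print("explained variance ratio:", round(explained, 3))
```

In this toy example the two co-occurring labels receive identical loadings on the first component, which is one way to read the "shared" variance described in the abstract. The per-document `scores` are the kind of low-dimensional signal that could be passed to a downstream classifier, though how the paper actually combines them with the label predictors is not stated here.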