Learning Semantic Similarity For Multi-Label Text Categorization

Li Li, Mengxiang Wang, Longkai Zhang, Houfeng Wang
DOI: https://doi.org/10.1007/978-3-319-14331-6_26
2014-01-01
Abstract: Multi-label text categorization is a supervised learning task in which a document is associated with multiple labels simultaneously. Current multi-label text categorization approaches are limited when expensive labelled text data is scarce but unlabelled text data is abundant, because they cannot exploit information from the unlabelled data. To address this problem, we learn word semantic similarity from the unlabelled text data using deep learning, and then incorporate the learned similarity into current multi-label text categorization approaches. Experiments on the Slash-dot and Tmc2007 datasets demonstrate that the proposed method greatly improves the performance of current multi-label text categorization approaches.