Effective Collaborative Representation Learning for Multilabel Text Categorization

Hao Wu, Shaowei Qin, Rencan Nie, Jinde Cao, Sergey Gorbachev

DOI: https://doi.org/10.1109/tnnls.2021.3069647

IF: 14.255

2021-01-01

IEEE Transactions on Neural Networks and Learning Systems

Abstract: With the booming of deep learning, massive attention has been paid to developing neural models for multilabel text categorization (MLTC). Most of the works concentrate on disclosing word–label relationship, while less attention is taken in exploiting global clues, particularly with the relationship of document–label. To address this limitation, we propose an effective collaborative representation learning (CRL) model in this article. CRL consists of a factorization component for generating shallow representations of documents and a neural component for deep text-encoding and classification. We have developed strategies for jointly training those two components, including an alternating-least-squares-based approach for factorizing the pointwise mutual information (PMI) matrix of label–document and multitask learning (MTL) strategy for the neural component. According to the experimental results on six data sets, CRL can explicitly take advantage of the relationship of document–label and achieve competitive classification performance in comparison with some state-of-the-art deep methods.

computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture

What problem does this paper attempt to address?

The paper addresses Multilabel Text Categorization (MLTC), and in particular how to make better use of global cues between documents and labels to improve text representations. Its key contributions are:

1. **Proposing an effective Collaborative Representation Learning (CRL) framework**: The framework aims to leverage both word-label and document-label relationships, generating better document and label representations through collaborative training.
2. **Developing an effective joint training strategy**: This includes an Alternating Least Squares (ALS) method for factorizing the label-document Pointwise Mutual Information (PMI) matrix and a Multitask Learning (MTL) strategy for the neural network component.
3. **Improving MLTC performance**: Experiments on six data sets indicate that CRL can explicitly exploit the document-label relationship and achieve competitive classification performance compared with some state-of-the-art deep learning methods.

The paper points out that most existing work focuses on revealing word-label relationships, with less attention given to global cues such as document-label relationships. To address this limitation, the authors propose a collaborative representation learning model with two parts: a factorization component and a neural network component. The factorization component generates shallow representations of documents and labels, while the neural network component handles deep text encoding and classification. The two components are trained jointly to learn the document and label representations. In this way, CRL can benefit from document-label relationships as well as word-label relationships, achieving better text representation and classification performance within a unified framework.
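To make the factorization component more concrete, the following is a minimal sketch of building a document-label PMI matrix and factorizing it with regularized alternating least squares. It is a generic illustration, not the authors' implementation: the function names `pmi_matrix` and `als_factorize`, the use of positive PMI (negative values clipped to zero), the rank, the regularization weight, and the toy label matrix are all assumptions introduced here.

```python
import numpy as np


def pmi_matrix(Y, positive=True):
    """Pointwise mutual information of a binary (n_docs, n_labels) matrix.

    Clipping to positive PMI is an assumption of this sketch.
    """
    Y = np.asarray(Y, dtype=float)
    p_dl = Y / Y.sum()                      # joint probability p(d, l)
    p_d = p_dl.sum(axis=1, keepdims=True)   # marginal p(d)
    p_l = p_dl.sum(axis=0, keepdims=True)   # marginal p(l)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_dl / (p_d * p_l))
    pmi[~np.isfinite(pmi)] = 0.0            # unobserved pairs get 0
    return np.maximum(pmi, 0.0) if positive else pmi


def als_factorize(M, rank=2, reg=0.1, n_iters=20, seed=0):
    """Approximate M by D @ L.T, minimizing squared error plus an L2 penalty.

    Each half-step is a ridge regression with the other factor held fixed.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_labels = M.shape
    D = rng.normal(scale=0.1, size=(n_docs, rank))    # document embeddings
    L = rng.normal(scale=0.1, size=(n_labels, rank))  # label embeddings
    ridge = reg * np.eye(rank)
    losses = []
    for _ in range(n_iters):
        # Fix L and solve for D in closed form.
        D = np.linalg.solve(L.T @ L + ridge, L.T @ M.T).T
        # Fix D and solve for L in closed form.
        L = np.linalg.solve(D.T @ D + ridge, D.T @ M).T
        err = np.sum((M - D @ L.T) ** 2)
        losses.append(err + reg * (np.sum(D ** 2) + np.sum(L ** 2)))
    return D, L, losses


# Toy example: 4 documents, 3 labels (hypothetical data).
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
M = pmi_matrix(Y)
D, L, losses = als_factorize(M, rank=2)
```

In this sketch the rows of `D` play the role of shallow document representations and the rows of `L` the role of label representations. How the paper couples these with the neural component during joint training is not specified in the summary above, so that step is left out.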

Dual Enhancement for Multi-Label Learning with Missing Labels

Partial Multi-label Learning with Label and Feature Collaboration

Semi-Supervised Dual Relation Learning for Multi-Label Classification

Multi-label Text Categorization with Hidden Components

Collaborative Work with Linear Classifier and Extreme Learning Machine for Fast Text Categorization

Collaborative Multilabel Classification

Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning

Research of multi-label text classification based on label attention and correlation networks

A Deep Multi-Task Representation Learning Method for Time Series Classification and Retrieval

A Label Information Aware Model for Multi-label Text Classification

Visual-Language Collaborative Representation Network for Broad-Domain Few-Shot Image Classification

Label Correlation Mixture Model: A Supervised Generative Approach to Multilabel Spoken Document Categorization

Rethinking Modal-oriented Label Correlations for Multi-modal Multi-label Learning

Collaboration based multi-modal multi-label learning

Label correlation mixture model for multi-label text categorization

Learning Semantic Similarity For Multi-Label Text Categorization

A Hybrid Model Based on Convolutional Neural Network and Long Short-Term Memory for Multi-label Text Classification

Learning Disentangled Label Representations for Multi-label Classification

Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

MFLSCI: Multi-granularity fusion and label semantic correlation information for multi-label legal text classification