Multi-label text classification based on semantic-sensitive graph convolutional network

Delong Zeng,Enze Zha,Jiayi Kuang,Ying Shen
DOI: https://doi.org/10.1016/j.knosys.2023.111303
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: Multi-Label Text Classification (MLTC) is an important but challenging task in the field of natural language processing. In this paper, we propose a novel method, Semantic-sensitive Graph Convolutional Network (S-GCN), by simultaneously considering semantic and word-global associations. More specifically, we first leverage texts, words, and labels to construct a global graph, which helps mine the relevance between similar documents. Then we design and pre-train an encoder to initialize text nodes in the graph, from which the semantic features of documents are extracted. Next, we employ a graph convolutional network to classify text nodes, which can well fuse node information. Finally, we normalize the adjacency matrix and store hidden layer representations of word nodes, tackling the issue that conventional graph-based methods cannot predict texts that did not appear during training. We conduct experiments on three public datasets, AAPD, RMSC-V2, and Reuters-21578, and demonstrate the superiority of our model over the baselines on the MLTC task. Source code is available at https://github.com/sysu18364004/SGCN.
computer science, artificial intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the challenges in Multi-Label Text Classification (MLTC). MLTC is a significant task in natural language processing, with the goal of predicting multiple labels for a given text. Specifically, the paper proposes a new method, Semantic-sensitive Graph Convolutional Network (S-GCN), which improves the performance of multi-label text classification by simultaneously considering semantic information and global word associations.

### Main Challenges

1. **Text Modeling Ignores Global Information**: Traditional text modeling methods usually focus only on local features, ignoring the global co-occurrence relationships between words.
2. **Graph Neural Networks Cannot Predict New Samples**: Existing graph-based methods require retraining the model during the testing phase, which significantly increases the time cost.
3. **Label Correlation**: In multi-label text classification, there may be correlations between different labels, and effectively capturing these correlations is a challenge.

### Solutions

To address the above challenges, the paper proposes the following solutions:

1. **Constructing a Global Graph**: By treating texts, words, and labels as nodes, a global graph is constructed to capture the similarity between documents and the global co-occurrence relationships of words.
2. **Initializing Text Nodes**: An encoder is designed and pre-trained to initialize the text nodes in the graph, extracting the semantic features of the documents.
3. **Graph Convolutional Network Classification**: A graph convolutional network is used to classify the text nodes, effectively integrating node information.
4. **Storing Hidden Layer Representations**: During training, the hidden layer representations of word nodes are stored, so that the features of test samples can be obtained through adjacency relationships during testing, solving the problem of graph neural networks being unable to predict new samples.
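
The four solution steps can be illustrated with a small NumPy sketch. It is not the authors' implementation. The toy graph sizes, the edge weights, the random feature and weight matrices, the symmetric normalization (borrowed from the standard GCN formulation), and the helper `predict_unseen` are all assumptions made for illustration. In S-GCN the text-node features would come from the pre-trained encoder rather than from random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN."""
    adj = adj + np.eye(adj.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))  # degrees are > 0 thanks to self-loops
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Step 1: toy global graph with text, word, and label nodes (sizes are hypothetical).
n_text, n_word, n_label = 2, 3, 2
n = n_text + n_word + n_label
adj = np.zeros((n, n))
text_word = np.array([[0.5, 0.2, 0.0],           # invented text-word edge weights
                      [0.0, 0.3, 0.7]])
word_label = np.array([[1.0, 0.0],               # invented word-label edge weights
                       [0.5, 0.5],
                       [0.0, 1.0]])
adj[:n_text, n_text:n_text + n_word] = text_word
adj[n_text:n_text + n_word, n_text + n_word:] = word_label
adj = np.maximum(adj, adj.T)                     # make the graph undirected
a_hat = normalize_adjacency(adj)

# Step 2: node features. Random stand-ins here; the paper initializes text
# nodes with a pre-trained encoder.
feat_dim, hidden_dim = 4, 8
x = rng.normal(size=(n, feat_dim))
w1 = rng.normal(size=(feat_dim, hidden_dim))     # untrained weights, illustration only
w2 = rng.normal(size=(hidden_dim, n_label))

# Step 3: two-layer GCN; a sigmoid per label gives multi-label probabilities.
h1 = np.maximum(a_hat @ x @ w1, 0.0)             # first layer with ReLU
train_probs = sigmoid((a_hat @ h1 @ w2)[:n_text])

# Step 4: store the hidden representations of the word nodes.
stored_word_hidden = h1[n_text:n_text + n_word]

def predict_unseen(word_weights, word_hidden, w_out, threshold=0.5):
    """Hypothetical helper: score an unseen text from its edges to known words.

    Assumes the text shares at least one word with the training vocabulary.
    """
    weights = word_weights / word_weights.sum()  # normalize the new adjacency row
    probs = sigmoid(weights @ word_hidden @ w_out)
    return probs, probs >= threshold

probs, labels = predict_unseen(np.array([0.1, 0.0, 0.9]), stored_word_hidden, w2)
```

Because the unseen text is scored only from the stored word representations and its own edge weights, no retraining or re-propagation over the full graph is needed in this sketch, which is the property the fourth solution step is after.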
### Experimental Validation

The paper conducts experiments on three public datasets (AAPD, RMSC-V2, Reuters-21578), and the results show that S-GCN outperforms the baseline models on the multi-label text classification task. Additionally, the paper answers the following research questions through experiments:

- **RQ1**: Does adding label nodes to the global graph help inference?
- **RQ2**: Does training the LSTM on the specific task help node initialization?
- **RQ3**: Does storing hidden layer representations effectively improve the inference efficiency of TextGCN?

### Summary

The main contributions of the paper include:

- Proposing a new Semantic-sensitive Graph Convolutional Network (S-GCN) that captures the global co-occurrence relationships between words by constructing a global graph.
- Designing an efficient text node initialization method that provides rich semantic information and promotes rapid model convergence.
- Showing through experiments on three public datasets that S-GCN outperforms other models on the multi-label text classification task.