Graph Neural Networks on Discriminative Graphs of Words

Yassine Abbahaddou, Johannes F. Lutzeyer, Michalis Vazirgiannis
2024-10-27
Abstract: In light of the recent success of Graph Neural Networks (GNNs) and their ability to perform inference on complex data structures, many studies apply GNNs to the task of text classification. In most previous methods, a heterogeneous graph, containing both word and document nodes, is constructed using the entire corpus and a GNN is used to classify document nodes. In this work, we explore a new Discriminative Graph of Words Graph Neural Network (DGoW-GNN) approach encapsulating both a novel discriminative graph construction and model to classify text. In our graph construction, containing only word nodes and no document nodes, we split the training corpus into disconnected subgraphs according to their labels and weight edges by the pointwise mutual information of the represented words. Our graph construction, for which we provide theoretical motivation, allows us to reformulate the task of text classification as the task of walk classification. We also propose a new model for the graph-based classification of text, which combines a GNN and a sequence model. We evaluate our approach on seven benchmark datasets and find that it is outperformed by several state-of-the-art baseline models. We analyse reasons for this performance difference and hypothesise under which conditions it is likely to change.
Machine Learning, Computation and Language
### What problem does this paper attempt to solve?

This paper addresses a key question in text classification: how to use graph neural networks (GNNs) to capture the structural information of text more effectively. Specifically, the authors propose DGoW-GNN, a GNN-based method built on a Discriminative Graph of Words (DGoW), and test whether it improves on existing graph-based text classifiers.

#### Main problems and innovation points:

1. **Limitations of existing methods**:
   - Most existing methods construct a Mixed Graph of Words (MGoW): a heterogeneous graph, built from the entire corpus, that contains both document nodes and word nodes. This structure may not separate the information of different categories well.
   - These methods usually build the graph jointly from training and test documents, which makes it difficult to predict the labels of unseen documents directly (that is, to work in the inductive learning setting).
2. **Core ideas of the new method**:
   - **Discriminative Graph of Words (DGoW)**: The authors split the training corpus into disconnected subgraphs according to category labels. Each subgraph contains only word nodes and no document nodes, so that information from different categories is kept apart.
   - **Edge weight calculation**: In DGoW, the weight of an edge is the pointwise mutual information (PMI) of the word pair it connects, which captures how strongly the two words are associated.
   - **Task redefinition**: The authors reformulate text classification as walk classification, that is, predicting which category subgraph a sentence, viewed as a walk over word nodes, belongs to.
3. **Model architecture**:
   - **Combining GNN and sequence model**: The proposed DGoW-GNN model combines a GNN with a bidirectional long short-term memory network (Bi-LSTM) to use both the global structural information and the local sequential information of words.
   - **Classification process**: For each sentence, the model computes the probability that it belongs to each category subgraph and selects the category with the highest probability as the prediction.
4. **Experimental verification**:
   - The authors evaluate DGoW-GNN on seven benchmark datasets and compare it with several state-of-the-art baseline models.
   - The results show that DGoW-GNN is outperformed by several of these baselines. The authors analyse the reasons for this performance difference and hypothesise under which conditions it is likely to change. The graph construction itself is supported by theoretical motivation, in particular regarding category separation.

In summary, this paper tackles the insufficient capture of structural information in text classification by introducing the Discriminative Graph of Words (DGoW) and the corresponding graph neural network (DGoW-GNN). The aim is better classification performance, although in the reported experiments the method does not surpass the strongest baselines.
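The graph construction and the walk view of classification summarised above can be illustrated with a small sketch. This is not the authors' implementation: the function names (`build_class_graphs`, `walk_coverage`, `predict`), the sliding-window co-occurrence counts, and the window size are all assumptions made for illustration. In particular, `walk_coverage` is a crude stand-in for the GNN plus sequence model; it only measures how much of a token sequence is a valid walk in each class subgraph.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def build_class_graphs(docs, labels, window=3):
    """Build one word graph per class label, with PMI edge weights.

    docs is a list of token lists, labels the matching class labels.
    Returns {label: {(w1, w2): pmi}} with w1 < w2. Co-occurrence is
    counted in sliding windows, which is an assumption of this sketch.
    """
    by_label = defaultdict(list)
    for tokens, label in zip(docs, labels):
        by_label[label].append(tokens)

    graphs = {}
    for label, class_docs in by_label.items():
        word_count = Counter()  # windows containing each word
        pair_count = Counter()  # windows containing each word pair
        n_windows = 0
        for tokens in class_docs:
            # Documents shorter than the window yield a single window.
            for i in range(max(1, len(tokens) - window + 1)):
                seen = sorted(set(tokens[i:i + window]))
                n_windows += 1
                word_count.update(seen)
                pair_count.update(combinations(seen, 2))
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ), estimated per class.
        graphs[label] = {
            (a, b): math.log(n_ab * n_windows / (word_count[a] * word_count[b]))
            for (a, b), n_ab in pair_count.items()
        }
    return graphs


def walk_coverage(tokens, edges):
    """Fraction of consecutive token pairs that are edges of the graph."""
    steps = [tuple(sorted(p)) for p in zip(tokens, tokens[1:]) if p[0] != p[1]]
    if not steps:
        return 0.0
    return sum(step in edges for step in steps) / len(steps)


def predict(tokens, graphs):
    """Pick the class whose subgraph covers the token walk best."""
    return max(graphs, key=lambda label: walk_coverage(tokens, graphs[label]))
```

Because each class graph is built only from documents of that class, the subgraphs share no edges by construction, even when the same word appears in several classes. A learned model would replace `walk_coverage` with a score computed from node embeddings and the word order.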