Abstract: Large-scale Pre-trained Language Models (PLMs) have become the backbone of text classification because of their strong performance. However, they treat input documents as independent and uniformly distributed, disregarding potential relationships among documents, which can lead to classification errors. To address this issue, recent work explores integrating Graph Neural Networks (GNNs) with PLMs, since GNNs can effectively model document relationships. Yet combining graph-based methods with PLMs is challenging because of the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and lets them approximate each other through mutual learning of their predicted probability distributions. Because the two channels interact only through these distributions, adapting graph-based models to PLMs becomes simpler and the entire architecture can be trained end to end. Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate an Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in areas such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships is crucial. Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.
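
As a rough illustration of mutual learning over probability distributions, the sketch below computes one loss per channel for a single document: a cross-entropy term on the label plus a KL term that pulls the channel toward its peer's prediction. This is a minimal sketch under stated assumptions, not the method as specified in the paper. The KL direction follows common deep mutual learning practice, the uncertainty weighting uses a learnable log-variance form (`exp(-s) * loss + s`), and all function and parameter names (`mutual_learning_losses`, `log_vars`, and so on) are hypothetical. Asymmetrical Learning is not shown, since the abstract does not define it.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label] + 1e-12)

def kl_divergence(p, q):
    """KL(p || q): how far q is from the target distribution p."""
    return sum(pi * (math.log(pi + 1e-12) - math.log(qi + 1e-12))
               for pi, qi in zip(p, q))

def uncertainty_weighted(losses, log_vars):
    """Assumed weighting form: sum_i exp(-s_i) * L_i + s_i,
    where each s_i would be a learnable log-variance."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))

def mutual_learning_losses(plm_logits, gnn_logits, label, log_vars=(0.0, 0.0)):
    """Return (plm_loss, gnn_loss) for one document.

    Each channel is supervised by the label and by its peer's
    distribution; in real training the peer's distribution would be
    treated as a constant when back-propagating through each loss.
    """
    p_plm, p_gnn = softmax(plm_logits), softmax(gnn_logits)
    plm_loss = uncertainty_weighted(
        [cross_entropy(p_plm, label), kl_divergence(p_gnn, p_plm)], log_vars)
    gnn_loss = uncertainty_weighted(
        [cross_entropy(p_gnn, label), kl_divergence(p_plm, p_gnn)], log_vars)
    return plm_loss, gnn_loss

# Hypothetical logits for a 3-class problem, gold label 0.
plm_loss, gnn_loss = mutual_learning_losses([2.0, 0.5, -1.0], [1.0, 1.0, -0.5], 0)
print(plm_loss, gnn_loss)
```

When both channels already agree, the KL terms vanish and each loss reduces to its cross-entropy term, which is why only output distributions, and no graph or sequence structure, need to be shared between the two channels.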

GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning
