Abstract: Can we combine heterogeneous graph structure with text to learn high-quality semantic and behavioural representations? Graph neural networks (GNNs) encode numerical node attributes and graph structure to achieve impressive performance in a variety of supervised learning tasks. Current GNN approaches are challenged by textual features, which typically need to be encoded into a numerical vector before being provided to the GNN, a step that may incur some information loss. In this paper, we put forth an efficient and effective framework termed language model GNN (LM-GNN) to jointly train large-scale language models and graph neural networks. The effectiveness of our framework is achieved by applying stage-wise fine-tuning of the BERT model, first with heterogeneous graph information and then with a GNN model. Several system and design optimizations are proposed to enable scalable and efficient training. LM-GNN accommodates node and edge classification as well as link prediction tasks. We evaluate the LM-GNN framework on different datasets and showcase the effectiveness of the proposed approach. LM-GNN provides competitive results in an Amazon query-purchase-product application.

What problem does this paper attempt to address?

The main problem that this paper attempts to solve is how to combine heterogeneous graph structures and text information to learn high-quality semantic and behavioral representations. Specifically, existing graph neural networks (GNNs) face challenges when dealing with text features, because text usually needs to be encoded into numerical vectors first, which may lead to information loss. To this end, the authors propose an efficient and effective framework, the Language Model Graph Neural Network (LM-GNN), for jointly training large-scale language models and graph neural networks.

### Main Problems

1. **Efficient Encoding of Text Features**:
   - When existing GNN methods deal with text features, they usually need to encode the text into numerical vectors, which may result in information loss.
   - The paper proposes a phased fine-tuning framework (LM-GNN). By first pre-training the BERT model with graph structure and then jointly fine-tuning it with the GNN model, this problem can be overcome.
2. **Efficient Training of Large-Scale Graph Data**:
   - Training on large-scale graph data involves challenges in efficiency and effectiveness, especially when jointly training large-scale language models and graph neural networks.
   - The paper proposes a series of system and design optimizations to achieve scalable and efficient training, including techniques such as caching BERT embeddings, joint negative sampling, and distributed GNN training.

### Solutions

1. **Phased Fine-Tuning Framework**:
   - **Semantic Encoder**: Use the pre-trained BERT model as a semantic encoder to encode the text information of nodes into embedding vectors.
   - **Graph Encoder**: Use the modified RGCN (Relational Graph Convolutional Network) as a graph encoder to capture graph structure information.
   - **Supervision Method**: Supervise the training of the model through structural prediction tasks (such as link prediction) and node prediction tasks (such as node classification).
2. **System and Design Optimizations**:
   - **Caching BERT Embeddings**: Cache the text embeddings of some nodes during the training process to reduce computational overhead.
   - **Joint Negative Sampling**: Share negative sample nodes across a mini-batch to reduce the number of nodes in each mini-batch and accelerate training.
   - **Distributed GNN Training**: Utilize the distributed GNN training framework to achieve efficient training on large-scale graph data.

### Experimental Results

- **Node Classification Task**: On public datasets (such as arXiv, product datasets, Yelp), the LM-GNN framework performs excellently in the node classification task, especially when using the graph-aware BERT model and the phased fine-tuning strategy.
- **Link Prediction Task**: In the link prediction task, the LM-GNN framework also achieves very good performance, especially after pre-training with the graph-aware BERT model.

### Conclusion

Through the proposed LM-GNN framework, the paper effectively addresses the challenges of combining heterogeneous graph structures and text information and improves the performance of graph neural networks on multiple tasks. In particular, the phased fine-tuning strategy and a series of system optimization techniques enable this framework to achieve efficient and effective training on large-scale graph data.
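Two of the optimizations named above, caching text embeddings and joint negative sampling, can be illustrated with a small standard-library sketch. This is not the paper's implementation. `CachedTextEncoder`, `toy_encode`, `independent_negatives`, `joint_negatives`, and `nodes_in_batch` are hypothetical names introduced here, and `toy_encode` is only a placeholder standing in for a BERT forward pass. The sketch assumes uniform sampling of negative destination nodes and counts how many distinct nodes a link-prediction mini-batch would need to encode.

```python
import random


class CachedTextEncoder:
    """Wraps an expensive text encoder and caches one embedding per node id."""

    def __init__(self, encode_fn):
        self.encode_fn = encode_fn  # stand-in for a BERT forward pass (assumption)
        self.cache = {}             # node id -> embedding
        self.calls = 0              # number of real encoder invocations

    def embed(self, node_id, text):
        if node_id not in self.cache:
            self.cache[node_id] = self.encode_fn(text)
            self.calls += 1
        return self.cache[node_id]


def toy_encode(text):
    # Hypothetical placeholder for BERT: a tiny deterministic 2-d "embedding".
    return (len(text), sum(map(ord, text)) % 97)


def independent_negatives(pos_edges, num_nodes, k, rng):
    """Draw k negative destination nodes separately for every positive edge."""
    return {edge: [rng.randrange(num_nodes) for _ in range(k)] for edge in pos_edges}


def joint_negatives(pos_edges, num_nodes, k, rng):
    """Draw k negative destination nodes once and share them across the batch."""
    shared = [rng.randrange(num_nodes) for _ in range(k)]
    return {edge: shared for edge in pos_edges}


def nodes_in_batch(pos_edges, negatives):
    """Distinct nodes whose text must be encoded for this mini-batch."""
    nodes = set()
    for src, dst in pos_edges:
        nodes.update((src, dst))
        nodes.update(negatives[(src, dst)])
    return nodes


rng = random.Random(0)
num_nodes, k = 10_000, 32
pos_edges = [(i, i + 1) for i in range(0, 128, 2)]  # 64 positive edges, 128 nodes

indep = nodes_in_batch(pos_edges, independent_negatives(pos_edges, num_nodes, k, rng))
joint = nodes_in_batch(pos_edges, joint_negatives(pos_edges, num_nodes, k, rng))
print("distinct nodes, independent negatives:", len(indep))
print("distinct nodes, joint negatives:", len(joint))

# Encoding the same batch twice only invokes the encoder once per distinct node.
encoder = CachedTextEncoder(toy_encode)
for _ in range(2):
    for node in joint:
        encoder.embed(node, f"node {node}")
print("encoder calls with cache:", encoder.calls)
```

With joint sampling the batch contains at most the 128 endpoint nodes plus `k` shared negatives, whereas independent sampling can add up to `k` new nodes per positive edge. Since each distinct node costs one language-model forward pass in this setup, a smaller node set and a cache both reduce encoder work.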

Efficient and effective training of language and graph neural network models
