Abstract: Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic. However, MLPs rely exclusively on the node features and fail to capture the graph structural information. Previous methods address this issue by processing graph edges into extra inputs for MLPs, but such graph structures may be unavailable for various scenarios. To this end, we propose a Prototype-Guided Knowledge Distillation (PGKD) method, which does not require graph edges (edge-free) yet learns structure-aware MLPs. Specifically, we analyze the graph structural information in GNN teachers, and distill such information from GNNs to MLPs via prototypes in an edge-free setting. Experimental results on popular graph benchmarks demonstrate the effectiveness and robustness of the proposed PGKD.

What problem does this paper attempt to address?

The paper addresses how to distill the knowledge of high-accuracy graph neural networks (GNNs) into low-latency multilayer perceptrons (MLPs) so that the MLPs still capture graph-structure information, even when graph edges are unavailable. It proposes a prototype-guided knowledge distillation (PGKD) method that transfers the graph-structure information in GNNs to MLPs through class prototypes without using graph edges (the "edge-free" setting), which makes the MLPs structure-aware.

### Main Problems and Solutions

1. **Problems**:
   - **High Latency of GNNs**: The message-passing architecture of GNNs causes high inference latency, which limits their use in real-time applications.
   - **Lack of Structure Information in MLPs**: MLPs are computationally efficient, but they cannot capture graph-structure information, so they perform poorly on graph tasks.
   - **Limitations of Existing Methods**: Existing methods either rely on graph edges, which are not available in some scenarios (such as federated graph learning), or inject graph-structure information through regularization terms that are not tied to the GNN teacher model.
2. **Solutions**:
   - **PGKD Method**: The paper analyzes the graph-structure information in GNNs and designs additional loss functions that transfer this information to MLPs, so that the MLPs become structure-aware without using graph edges.
   - **Class-Prototype Strategy**: PGKD computes a prototype vector for each class (the average representation of all nodes in that class) and designs two loss functions on top of it: an intra-class loss and an inter-class loss. The intra-class loss pulls the representations of nodes in the same class closer to their class prototype, while the inter-class loss transfers the inter-class relationship patterns of the GNN by adjusting the distances between different class prototypes.

### Experimental Results

The paper reports experiments on several popular graph benchmark datasets: Cora, Citeseer, A-computer, Penn94, Pubmed, and Twitch-gamer. The results show that PGKD transfers graph-structure information effectively. It outperforms the baseline method GLNN in average accuracy and also has a smaller standard deviation, which indicates stability and robustness.

### Innovations

1. **Structure-Awareness in the Edge-Free Setting**: The paper presents PGKD as the first method to make MLPs structure-aware in an edge-free setting.
2. **Class Prototypes for Distillation**: Through the class-prototype strategy, PGKD transfers the graph-structure information in GNNs. According to the paper, this is the first time prototypes have been used for knowledge distillation from GNNs to MLPs.
3. **Wide Applicability**: PGKD performs well with various GNN teachers (such as GraphSAGE, GAT, GCN, and APPNP) and on different datasets.

### Future Work

The paper also proposes future research directions, including extending PGKD to other graph tasks (such as graph classification and graph regression) and exploring ways of generating prototypes from node representations to further improve performance and robustness.
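To make the class-prototype strategy described under Solutions more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names, the use of Euclidean distance, the squared-error form of both losses, and the toy embeddings are all assumptions chosen for illustration; the summary only states that prototypes are class-wise averages, that the intra-class loss pulls nodes toward their prototype, and that the inter-class loss matches prototype distances to those of the GNN teacher. Note that no edges appear anywhere in the computation.

```python
from math import sqrt

def class_prototypes(embeddings, labels):
    """Return {label: mean embedding of all nodes with that label}."""
    sums, counts = {}, {}
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def intra_class_loss(embeddings, labels, prototypes):
    """Assumed form: mean squared distance of each node to its class prototype."""
    total = sum(euclidean(vec, prototypes[y]) ** 2
                for vec, y in zip(embeddings, labels))
    return total / len(embeddings)

def inter_class_loss(student_protos, teacher_protos):
    """Assumed form: mean squared gap between student and teacher
    prototype-to-prototype distances, over all class pairs."""
    classes = sorted(student_protos)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    if not pairs:
        return 0.0
    gaps = [(euclidean(student_protos[a], student_protos[b])
             - euclidean(teacher_protos[a], teacher_protos[b])) ** 2
            for a, b in pairs]
    return sum(gaps) / len(gaps)

# Toy data (hypothetical): four nodes, two classes, 2-d representations.
labels = [0, 0, 1, 1]
teacher_emb = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]  # from a GNN
student_emb = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0], [1.0, 5.0]]  # from an MLP

teacher_protos = class_prototypes(teacher_emb, labels)
student_protos = class_prototypes(student_emb, labels)

intra = intra_class_loss(student_emb, labels, student_protos)
inter = inter_class_loss(student_protos, teacher_protos)
print(intra, inter)
```

In a real training loop these two terms would be added, with weighting coefficients, to the usual distillation objective and computed on differentiable tensors rather than Python lists. The inter-class term above compares only distances between prototypes, so the student is free to place its prototypes anywhere as long as their relative geometry matches the teacher's.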

Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation from GNNs to MLPs

Learning Structure Perception MLPs on Graphs: a Layer-Wise Graph Knowledge Distillation Framework

Knowledge Distillation Improves Graph Structure Augmentation for Graph Neural Networks

Decoupled graph knowledge distillation: A general logits-based method for learning MLPs on graphs

Adaptive Hierarchical Knowledge Distillation from GNNs to MLPs

Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation

Unveiling the Unseen Potential of Graph Learning through MLPs: Effective Graph Learners Using Propagation-Embracing MLPs

Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework

Extracting Low-/High- Frequency Knowledge from Graph Neural Networks and Injecting it into MLPs: An Effective GNN-to-MLP Distillation Framework

A Teacher-Free Graph Knowledge Distillation Framework with Dual Self-Distillation

Distill Graph Structure Knowledge from Masked Graph Autoencoders into MLP

Propagate & Distill: Towards Effective Graph Learners Using Propagation-Embracing MLPs

Enhanced Scalable Graph Neural Network via Knowledge Distillation

AdaGMLP: AdaBoosting GNN-to-MLP Knowledge Distillation

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP

Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs

Knowledge Distillation Via Adaptive Meta-Learning for Graph Neural Network

Frameless Graph Knowledge Distillation

Graph Knowledge Distillation to Mixture of Experts

VQGraph: Rethinking Graph Representation Space for Bridging GNNs and MLPs

Narrow the Input Mismatch in Deep Graph Neural Network Distillation