Abstract: Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic. However, MLPs rely exclusively on the node features and fail to capture the graph structural information. Previous methods address this issue by processing graph edges into extra inputs for MLPs, but such graph structures may be unavailable for various scenarios. To this end, we propose a Prototype-Guided Knowledge Distillation (PGKD) method, which does not require graph edges (edge-free) yet learns structure-aware MLPs. Specifically, we analyze the graph structural information in GNN teachers, and distill such information from GNNs to MLPs via prototypes in an edge-free setting. Experimental results on popular graph benchmarks demonstrate the effectiveness and robustness of the proposed PGKD.
What problem does this paper attempt to address?
The paper addresses how to distill the knowledge of high-accuracy graph neural networks (GNNs) into low-latency multilayer perceptrons (MLPs) so that the MLPs still capture graph structural information, even when graph edges are unavailable. It proposes Prototype-Guided Knowledge Distillation (PGKD), which transfers the structural information learned by a GNN teacher to an MLP student through class prototypes, without using graph edges (the "edge-free" setting), thus making the MLP structure-aware.
### Main Problems and Solutions
1. **Problems**:
- **High Latency of GNNs**: The message-passing architecture of GNNs makes inference slow, which limits their use in real-time applications.
- **Lack of Structure Information in MLPs**: MLPs are computationally efficient, but they rely only on node features and cannot capture graph structural information, which results in poor performance on graph tasks.
- **Limitations of Existing Methods**: Existing methods either rely on graph edges, which are unavailable in some scenarios (such as federated graph learning), or inject graph structural information through regularization terms that are unrelated to the GNN teacher model.
2. **Solutions**:
- **PGKD Method**: The paper proposes Prototype-Guided Knowledge Distillation (PGKD). The method analyzes the graph structural information captured by GNN teachers and designs additional loss functions that transfer this information to MLPs, so that the MLPs become structure-aware without using graph edges.
- **Class-Prototype Strategy**: PGKD computes a prototype vector for each class (the average representation of all nodes in that class) and defines two loss functions on top of it: an intra-class loss and an inter-class loss. The intra-class loss pulls the representations of nodes in the same class toward their class prototype, while the inter-class loss transfers the inter-class relationship patterns of the GNN by adjusting the distances between the prototypes of different classes.
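The prototype strategy described above can be illustrated with a small NumPy sketch. The summary only states that a prototype is the mean representation of a class and that the two losses act on node-to-prototype and prototype-to-prototype distances; the exact formulations in the paper are not given here. The squared-Euclidean intra-class loss, the mean-squared mismatch between student and teacher prototype-distance matrices, the use of given class labels, and all function names below are therefore assumptions made for illustration only.

```python
import numpy as np

def class_prototypes(reps, labels, num_classes):
    """Mean representation of the nodes in each class (one row per class).

    Assumes every class has at least one node.
    """
    protos = np.zeros((num_classes, reps.shape[1]))
    for c in range(num_classes):
        protos[c] = reps[labels == c].mean(axis=0)
    return protos

def intra_class_loss(reps, labels, protos):
    """Assumed form: mean squared distance from each node to its own prototype."""
    diffs = reps - protos[labels]
    return float((diffs ** 2).sum(axis=1).mean())

def pairwise_distances(protos):
    """Euclidean distance between every pair of class prototypes."""
    diffs = protos[:, None, :] - protos[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

def inter_class_loss(student_protos, teacher_protos):
    """Assumed form: mean squared gap between student and teacher distance matrices."""
    d_student = pairwise_distances(student_protos)
    d_teacher = pairwise_distances(teacher_protos)
    return float(((d_student - d_teacher) ** 2).mean())

# Toy data (hypothetical): 4 nodes, 2 classes, 2-d student and 3-d teacher representations.
labels = np.array([0, 0, 1, 1])
student = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [6.0, 0.0]])
teacher = np.array([[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 5.0, 0.0], [0.0, 7.0, 0.0]])

student_protos = class_prototypes(student, labels, num_classes=2)
teacher_protos = class_prototypes(teacher, labels, num_classes=2)

intra = intra_class_loss(student, labels, student_protos)
inter = inter_class_loss(student_protos, teacher_protos)
print(intra, inter)
```

Neither function takes an adjacency matrix or edge list, which is the sense in which such a loss is edge-free. Comparing distance matrices rather than the prototypes themselves also means the teacher and student representations may have different dimensions, as in the toy data.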
### Experimental Results
The paper reports experiments on several popular graph benchmark datasets, including Cora, Citeseer, A-computer, Penn94, Pubmed, and Twitch-gamer. The results indicate that PGKD transfers graph structural information effectively: it outperforms the baseline method GLNN in average accuracy and also has a smaller standard deviation, which points to greater stability and robustness.
### Innovations
1. **Structure-Awareness in the Edge-Free Setting**: According to the paper, PGKD is the first method to make MLPs structure-aware in an edge-free setting.
2. **Innovative Application of Class Prototypes**: Through the class-prototype strategy, PGKD transfers the graph structural information captured by GNNs. The paper presents this as the first use of prototypes for knowledge distillation from GNNs to MLPs.
3. **Wide Applicability**: PGKD performs well with various GNN teachers (such as GraphSAGE, GAT, GCN, and APPNP) and on different datasets, which suggests broad applicability.
### Future Work
The paper also outlines future research directions, including extending PGKD to other graph tasks (such as graph classification and graph regression) and exploring ways of generating prototypes from node representations to further improve the performance and robustness of the model.