Abstract: Building on the message-passing paradigm, a large body of research has proposed diverse and impressive feature propagation mechanisms to improve the performance of GNNs. However, less attention has been paid to feature transformation, the other major operation of the message-passing framework. In this paper, we first empirically investigate the performance of the feature transformation operation in several typical GNNs. Unexpectedly, we find that GNNs do not fully exploit the power of their inherent feature transformation operation. Motivated by this observation, we propose Bi-directional Knowledge Transfer (BiKT), a plug-and-play approach that unleashes the potential of the feature transformation operations without modifying the original architecture. When the feature transformation operation is treated as a derived representation learning model that shares parameters with the original GNN, its direct prediction provides topology-agnostic knowledge feedback that can further guide the learning of the GNN and the feature transformations therein. On this basis, BiKT not only allows us to acquire knowledge from both the GNN and its derived model, but also lets each model improve the other by injecting its knowledge into the other. In addition, a theoretical analysis shows that BiKT improves the generalization bound of GNNs from the perspective of domain adaptation. Extensive experiments on up to 7 datasets with 5 typical GNNs demonstrate that BiKT brings a performance gain of 0.5% to 4% over the original GNN, yielding a boosted GNN. Meanwhile, the derived model also performs strongly, matching or even surpassing the original GNN, which allows it to be applied independently to other specific downstream tasks.

What problem does this paper attempt to address?

The main problem this paper addresses is that the feature transformation operation (T-operation) in graph neural networks (GNNs) is not fully utilized. Specifically, the paper points out:

1. **Under-utilization of the feature transformation operation**: In GNNs based on the message-passing framework, a great deal of research has focused on the feature propagation operation (P-operation) to improve performance, while the T-operation has received relatively little attention. Through experimental observations, the authors found that GNNs have not fully unleashed the potential of the feature transformation operation.
2. **Interaction between feature transformation and feature propagation**: The paper further explores how the two operations interact. The study found that the structural bias introduced by explicitly applying the P-operation affects how the T-operation models node content features. This indicates that the P-operation not only enhances the ability of GNNs to handle non-Euclidean data, but also affects the effectiveness of the T-operation.

Based on these problems, the paper proposes the Bi-directional Knowledge Transfer (BiKT) method, which aims to fully exploit the potential of the feature transformation operation through bidirectional knowledge transfer, without modifying the original GNN architecture, and thereby further improve the performance of GNNs. Specifically, BiKT is achieved in the following ways:

- **Generative model**: Use a generative model to capture the representation distributions of the GNNs and the MLP GNNs, thereby extracting model knowledge.
- **Knowledge injection**: Inject the knowledge learned from one model (the target model) into another model (the source model), achieving knowledge transfer by optimizing the objective function.
- **Parameter inheritance**: Implement recursive training through parameter inheritance and gradually integrate the knowledge of both sides.

Through these methods, BiKT can not only significantly improve the performance of existing GNNs, but also enable the derived MLP GNNs to perform on a par with or even outperform the original GNNs in certain tasks, thus providing a flexible application option.
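The relationship between a GNN and its derived model can be illustrated with a small sketch. It assumes a two-layer GCN-style network in which the P-operation is multiplication by a normalized adjacency matrix and the T-operation is a shared weight matrix; dropping the propagation step gives the derived, topology-agnostic model with the same parameters. The function names (`forward`, `symmetric_kl`) and the toy graph are hypothetical, and the symmetric KL term is only a stand-in for measuring how far apart the two predictions are. It is not the generative-model objective the paper uses.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[d[i] * a_hat[i][j] * d[j] for j in range(n)] for i in range(n)]

def softmax_rows(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in z:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def forward(x, weights, prop=None):
    """Two-layer forward pass with shared weights.

    prop=None skips propagation, giving the derived model (T-operation only).
    Passing a propagation matrix gives the GCN-style model (T then P per layer).
    """
    w1, w2 = weights
    h = matmul(x, w1)                      # T-operation, layer 1
    if prop is not None:
        h = matmul(prop, h)                # P-operation, layer 1
    h = [[max(v, 0.0) for v in row] for row in h]  # ReLU
    out = matmul(h, w2)                    # T-operation, layer 2
    if prop is not None:
        out = matmul(prop, out)            # P-operation, layer 2
    return softmax_rows(out)

def symmetric_kl(p, q, eps=1e-12):
    """Mean over nodes of KL(p||q) + KL(q||p); an assumed stand-in measure."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        for a, b in zip(p_row, q_row):
            total += (a - b) * (math.log(a + eps) - math.log(b + eps))
    return total / len(p)

# Toy path graph with 4 nodes, 3 input features, 2 classes (all illustrative).
rng = random.Random(0)
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
x = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
weights = (
    [[rng.gauss(0, 1) for _ in range(5)] for _ in range(3)],
    [[rng.gauss(0, 1) for _ in range(2)] for _ in range(5)],
)

prop = normalize_adjacency(adj)
gnn_probs = forward(x, weights, prop=prop)  # original GNN prediction
mlp_probs = forward(x, weights)             # derived model, same parameters
gap = symmetric_kl(gnn_probs, mlp_probs)    # disagreement between the two views
```

Because both calls to `forward` use the same `weights`, any update driven by one model's prediction also changes the other, which is the parameter-sharing property the abstract relies on. In a training loop, a term like `gap` could be added to the supervised loss so that each model is pulled toward the other's prediction, under the assumptions stated above.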

Unleashing the potential of GNNs via Bi-directional Knowledge Transfer
