Abstract: We propose DyGFormer, a new Transformer-based architecture for dynamic graph learning. DyGFormer is conceptually simple and only needs to learn from nodes' historical first-hop interactions by: (1) a neighbor co-occurrence encoding scheme that explores the correlations of the source node and destination node based on their historical sequences; (2) a patching technique that divides each sequence into multiple patches and feeds them to Transformer, allowing the model to effectively and efficiently benefit from longer histories. We also introduce DyGLib, a unified library with standard training pipelines, extensible coding interfaces, and comprehensive evaluating protocols to promote reproducible, scalable, and credible dynamic graph learning research. By performing exhaustive experiments on thirteen datasets for dynamic link prediction and dynamic node classification tasks, we find that DyGFormer achieves state-of-the-art performance on most of the datasets, demonstrating its effectiveness in capturing nodes' correlations and long-term temporal dependencies. Moreover, some results of baselines are inconsistent with previous reports, which may be caused by their diverse but less rigorous implementations, showing the importance of DyGLib. All the used resources are publicly available at https://github.com/yule-BUAA/DyGLib.
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve
The paper aims to address two main issues in dynamic graph learning:
1. **Insufficient Capture of Node Correlation and Long-term Temporal Dependencies**:
- Most existing dynamic graph learning methods compute the temporal representation of each node in an interaction independently and ignore the correlations between the two nodes, even though these correlations are often indicative of future interactions.
- Because existing methods learn at the interaction level, they are practical only for nodes with relatively few interactions. For nodes with long histories, they must sample or truncate interactions to keep costly modules (such as graph convolutions, temporal random walks, and sequence models) computationally feasible. Some methods instead use memory networks to process interactions sequentially, but these suffer from vanishing or exploding gradients.
2. **Poor Reproducibility Due to Inconsistent Training Pipelines**:
- Inconsistent training pipelines across different methods often lead to poor reproducibility.
- Existing methods are implemented with different frameworks and languages (such as PyTorch, TensorFlow, DGL, PyG, and C++), which makes it hard for researchers to quickly understand the algorithms and focus on the core issues of dynamic graph learning.
- Although some dynamic graph learning libraries exist, they mainly focus on dynamic network embedding methods, discrete-time graph learning methods, or engineering techniques for large-scale dynamic graph training, lacking standard tools for continuous-time dynamic graph learning.
To address these issues, the paper proposes two key technical contributions:
1. **A New Transformer-based Dynamic Graph Learning Architecture (DyGFormer)**:
- DyGFormer is conceptually simple: it learns only from nodes' historical first-hop interaction sequences. Its neighbor co-occurrence encoding scheme explicitly captures the correlations between the source and destination nodes by encoding how often each neighbor appears in the two nodes' sequences.
- To capture long-term temporal dependencies, DyGFormer divides each node's sequence into multiple patches and feeds the patches, rather than individual interactions, to the Transformer. This patching technique lets the model use longer histories effectively by preserving local temporal proximity within each patch, and it lowers computational cost because the Transformer attends over a shorter sequence of patches.
2. **A Unified Continuous-time Dynamic Graph Learning Library (DyGLib)**:
- DyGLib is an open-source toolkit with standard training pipelines, extensible coding interfaces, and comprehensive evaluation protocols, aiming to promote reproducible, scalable, and credible dynamic graph learning research.
- DyGLib integrates multiple continuous-time dynamic graph learning methods and benchmark datasets from different domains. All methods are trained through the same pipeline, which removes the effect of differing implementations, and the modular design lets developers add new datasets and algorithms as needed.
- DyGLib supports dynamic link prediction and dynamic node classification tasks, providing comprehensive evaluation strategies for thorough comparison of existing methods.
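
The two DyGFormer ideas described above, neighbor co-occurrence encoding and patching, can be illustrated with a minimal sketch in plain Python. It is not the DyGLib implementation: the function names `cooccurrence_features` and `split_into_patches`, the use of raw counts as features, and the choice to pad the last patch at the end with `0` are all assumptions made for illustration. In a real model the counts would typically be projected by a learned layer and each patch would hold feature vectors rather than node IDs.

```python
from collections import Counter


def cooccurrence_features(src_neighbors, dst_neighbors):
    """For every neighbor in each sequence, return the pair
    [count in the source sequence, count in the destination sequence].

    Hypothetical helper: raw counts stand in for the learned encoding.
    """
    src_counts = Counter(src_neighbors)
    dst_counts = Counter(dst_neighbors)
    # Counter returns 0 for neighbors that never appear in a sequence.
    src_feats = [[src_counts[n], dst_counts[n]] for n in src_neighbors]
    dst_feats = [[src_counts[n], dst_counts[n]] for n in dst_neighbors]
    return src_feats, dst_feats


def split_into_patches(sequence, patch_size, pad_value=0):
    """Divide a sequence into consecutive patches of length patch_size.

    Assumption: if the length is not divisible by patch_size, the last
    patch is padded at the end with pad_value.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    padded = list(sequence)
    remainder = len(padded) % patch_size
    if remainder:
        padded += [pad_value] * (patch_size - remainder)
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


if __name__ == "__main__":
    # Source node interacted with a, b, a; destination node with b, b, a, c.
    src_feats, dst_feats = cooccurrence_features(
        ["a", "b", "a"], ["b", "b", "a", "c"]
    )
    print(src_feats)  # [[2, 1], [1, 2], [2, 1]]
    print(dst_feats)  # [[1, 2], [1, 2], [2, 1], [0, 1]]

    # Five interactions with patch size 2 give three patches, so the
    # Transformer would see 3 tokens instead of 5.
    print(split_into_patches([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5, 0]]
```

In this example, neighbor `a` appears twice in the source sequence and once in the destination sequence, so it is encoded as `[2, 1]` wherever it occurs. A neighbor that both nodes have interacted with frequently gets large values in both positions, which is the kind of correlation signal the encoding is meant to expose. Patching shortens the Transformer's input by a factor of roughly the patch size, which is why longer histories stay affordable.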