Heterogeneous Data Augmentation in Graph Contrastive Learning for Effective Negative Samples

Adnan Ali, Jinlong Li, Huanhuan Chen
DOI: https://doi.org/10.1016/j.compeleceng.2024.109304
IF: 4.152
2024-01-01
Computers & Electrical Engineering
Abstract: Graph contrastive learning (GCL) aims to contrast positive-negative counterparts, whereas graph data augmentation (GDA) in GCL is employed to generate positive-negative samples. Existing GDA techniques, such as 1-dimensional (1D) feature masking, have a high probability of producing partially homogeneous views. Homogeneous views encode identical embeddings, induce substandard positive-negative samples, and shift the performance burden onto structural augmentation. Thus, previous work requires plenty of negative samples, converges more slowly, consumes more memory for training, and offers trivial performance improvements on downstream tasks. To overcome these issues, firstly, we introduce a novel 2-dimensional (2D) feature masking technique in GCL to mask the feature matrix in 2 dimensions. Secondly, we introduce heterogeneous augmentation, a combination of 1D-2D feature masking and edge dropping. Thirdly, we present a negative-samples-based GCL framework: Heterogeneous Data Augmentation in Graph Contrastive Learning for Effective Negative Samples (DAENS). The effectiveness of DAENS's induced positive-negative samples can be quantified: using only inter-view negative samples improves node classification performance by up to 5.84% on 8 benchmark datasets. DAENS converges faster, takes less time per epoch, and is more memory efficient than existing state-of-the-art methods. Lastly, we empirically examine the effect of feature and structural augmentation on sparse and dense graphs with DAENS. The implementation of DAENS is available at https://github.com/mhadnanali/DAENS.
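The three augmentations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that 1D feature masking zeroes whole feature columns for every node, and that 2D feature masking draws an independent mask for each (node, feature) cell, which is one plausible reading of "masking the feature matrix in 2 dimensions". The function names, the zero fill value, and the probability parameter `p` are hypothetical.

```python
import random


def mask_features_1d(x, p, rng):
    """Assumed 1D masking: one Bernoulli(p) draw per feature column,
    shared by all nodes, so a masked column is zero in every row."""
    keep = [rng.random() >= p for _ in range(len(x[0]))]
    return [[v if k else 0.0 for v, k in zip(row, keep)] for row in x]


def mask_features_2d(x, p, rng):
    """Assumed 2D masking: an independent Bernoulli(p) draw per
    (node, feature) cell, so different nodes lose different features."""
    return [[v if rng.random() >= p else 0.0 for v in row] for row in x]


def drop_edges(edges, p, rng):
    """Edge dropping: remove each edge independently with probability p."""
    return [e for e in edges if rng.random() >= p]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the two views are reproducible
    features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    edges = [(0, 1), (1, 2), (0, 2)]
    # Hypothetical "heterogeneous" pair of views: one view uses 1D masking,
    # the other uses 2D masking, and both drop edges.
    view_a = (mask_features_1d(features, 0.3, rng), drop_edges(edges, 0.2, rng))
    view_b = (mask_features_2d(features, 0.3, rng), drop_edges(edges, 0.2, rng))
    print(view_a)
    print(view_b)
```

Under these assumptions, two 1D-masked views can coincide on every column that both masks happen to keep, whereas per-cell masking makes identical rows across views less likely, which matches the abstract's concern about partially homogeneous views.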