Abstract: Graphs are ubiquitous in various fields, and deep learning methods have been successfully applied to graph classification tasks. However, building large and diverse graph datasets for training can be expensive. While augmentation techniques exist for structured data such as images or numerical data, the augmentation of graph data remains challenging, primarily because of the complex and non-Euclidean nature of graph data. This paper proposes a novel augmentation strategy for graphs that operates in a non-Euclidean space. The approach leverages graphon estimation, which models the generative mechanism of network sequences. Computational results demonstrate the effectiveness of the proposed augmentation framework in improving the performance of graph classification models. Additionally, using a non-Euclidean distance, specifically the Gromov-Wasserstein distance, results in better approximations of the graphon. The framework also provides a means to validate different graphon estimation approaches, particularly in real-world scenarios where the true graphon is unknown.

What problem does this paper attempt to address?

The problem this paper addresses is how to augment graph data effectively in non-Euclidean spaces in order to improve performance on graph classification tasks. Constructing large-scale and diverse graph datasets for training is expensive, and augmenting graph data remains challenging, mainly because of the complexity and non-Euclidean nature of graph data.

### Main problems

1. **High cost of constructing large-scale and diverse graph datasets**
   - In practical applications, constructing graph datasets for training neural methods is very expensive.
2. **Limitations of existing graph data augmentation techniques**
   - Existing graph data augmentation strategies usually operate within a single graph (for example, by modifying edges or nodes) and cannot exchange information between different instances.
   - Traditional augmentation techniques (such as those for image, video, or text data) are difficult to apply directly to graph data because graph data has a complex non-Euclidean structure.

### Solutions

The paper proposes a new graph data augmentation strategy based on graphon estimation and the Gromov-Wasserstein barycenter. The method uses a graphon to model the generative mechanism of network sequences, and computational results demonstrate its effectiveness in improving the performance of graph classification models.

### Key points

- **Graphon**: A graphon is the limit object of sequences of large graphs and can be used to generate graphs of arbitrary size. New graphs are created by sampling node positions from a uniform distribution and generating an adjacency matrix according to the graphon.
- **Gromov-Wasserstein distance**: This is a non-Euclidean distance that is especially suitable for graph data. Using it yields better approximations of the graphon and helps validate different graphon estimation methods.

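
The two key points above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `sample_graph` and `gw_cost` and the toy graphon `example_graphon` are hypothetical, chosen only for illustration. The first function follows the sampling recipe described above (uniform node positions, then edges drawn with probability given by the graphon). The second only evaluates the squared-loss Gromov-Wasserstein objective for a coupling supplied by the caller; the actual Gromov-Wasserstein distance is the minimum of this objective over all valid couplings, which this sketch does not compute.

```python
import numpy as np


def example_graphon(x, y):
    """Hypothetical toy graphon: edges are likelier between nearby positions."""
    return 0.8 * np.exp(-3.0 * np.abs(x - y))


def sample_graph(graphon, n_nodes, rng):
    """Sample an undirected simple graph from a graphon W: [0,1]^2 -> [0,1]."""
    # Latent node positions drawn i.i.d. from Uniform(0, 1).
    u = rng.uniform(0.0, 1.0, size=n_nodes)
    # Edge probabilities p_ij = W(u_i, u_j).
    probs = graphon(u[:, None], u[None, :])
    # Draw each edge once in the strict upper triangle, then mirror it.
    upper = np.triu(rng.uniform(size=(n_nodes, n_nodes)) < probs, k=1)
    return (upper | upper.T).astype(int)


def gw_cost(c1, c2, coupling):
    """Squared-loss Gromov-Wasserstein objective for one fixed coupling.

    Computes sum_{i,j,k,l} (c1[i,k] - c2[j,l])^2 * T[i,j] * T[k,l].
    Memory grows as n^2 * m^2, so this is only suitable for small graphs.
    """
    # diff[i, j, k, l] = c1[i, k] - c2[j, l]
    diff = c1[:, None, :, None] - c2[None, :, None, :]
    return float(np.einsum("ijkl,ij,kl->", diff ** 2, coupling, coupling))


rng = np.random.default_rng(0)
graph_a = sample_graph(example_graphon, 20, rng)
graph_b = sample_graph(example_graphon, 15, rng)

# Product of uniform node weights: a valid coupling, so the resulting value
# is an upper bound on the squared Gromov-Wasserstein discrepancy.
product_coupling = np.full((20, 15), 1.0 / (20 * 15))
upper_bound = gw_cost(graph_a, graph_b, product_coupling)
```

Because the graphon is defined on the unit square rather than on a fixed node set, the same function can generate graphs of any size, which is what makes it usable as an augmentation source. In practice an optimal-transport solver would be used to minimize the objective over couplings instead of fixing one by hand.
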
### Experimental results

The experimental results show that training on the augmented dataset can significantly improve model performance in graph classification tasks, especially in multi-class problems or when the classes are hard to distinguish. In addition, graphons estimated with the Gromov-Wasserstein barycenter usually yield larger performance gains.

### Summary

The main contribution of this paper is a graph data augmentation framework based on graphons and the Gromov-Wasserstein barycenter, which addresses the limitations of existing augmentation techniques for graph data and demonstrates its effectiveness in graph classification tasks.

Graph data augmentation with Gromow-Wasserstein Barycenters
