A Class-Aware Representation Refinement Framework for Graph Classification

Jiaxing Xu, Jinjie Ni, Yiping Ke
2024-06-06
Abstract: Graph Neural Networks (GNNs) are widely used for graph representation learning. Despite their prevalence, GNNs suffer from two drawbacks in the graph classification task: the neglect of graph-level relationships and the generalization issue. Each graph is treated separately in GNN message passing and graph pooling, and existing methods to address overfitting operate on each individual graph. This makes the learned graph representations less effective in downstream classification. In this paper, we propose a Class-Aware Representation rEfinement (CARE) framework for the task of graph classification. CARE computes simple yet powerful class representations and injects them to steer the learning of graph representations towards better class separability. CARE is a plug-and-play framework that is highly flexible and able to incorporate arbitrary GNN backbones without significantly increasing the computational cost. We also theoretically prove that CARE has a better generalization upper bound than its GNN backbone through Vapnik-Chervonenkis (VC) dimension analysis. Our extensive experiments with 11 well-known GNN backbones on 9 benchmark datasets validate the superiority and effectiveness of CARE over its GNN counterparts.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses two main issues that graph neural networks (GNNs) face in graph classification tasks:

1. **Ignoring Graph-Level Relationships**: Existing GNN architectures process each input graph independently during training. Node representations are propagated through the GNN to produce a graph representation, and the loss is also computed for each graph independently. As a result, relationships (similarities and differences) between input graphs are neglected entirely. For example, in molecular data, molecules belonging to the same class often share common substructures (such as the same functional groups), which may be class-specific. Ignoring this graph-level information can result in poor graph representations for downstream classification.
2. **Generalization Issues**: GNN models are prone to overfitting as depth increases or hidden dimensions expand. Some methods try to alleviate overfitting by modifying input graphs, generating new graphs for adversarial learning, or applying contrastive learning, but they still operate on individual graphs and do not exploit graph-level information to improve generalization.

To address these issues, the authors propose a new framework called Class-Aware Representation rEfinement (CARE). CARE computes class representations and injects them to guide the learning of graph representations, thereby enhancing class separability. Theoretically, CARE is proven, through VC dimension analysis, to have a better generalization upper bound than its GNN backbone. Experiments with 11 GNN backbones on 9 benchmark datasets show that CARE outperforms its GNN counterparts.
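
The idea of class representations and class separability can be illustrated with a minimal sketch. The summary does not say how CARE actually computes or injects class representations, so everything here is an assumption made for illustration: the class representation is taken as the mean of the graph embeddings in each class, separability is scored with a cosine-similarity gap, and the names `class_representations`, `cosine`, and `separability_gap` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def class_representations(graph_embs, labels):
    """Average the graph embeddings of each class (an assumed aggregator)."""
    grouped = defaultdict(list)
    for emb, label in zip(graph_embs, labels):
        grouped[label].append(emb)
    return {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in grouped.items()
    }


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def separability_gap(graph_embs, labels, reps):
    """Mean over graphs of: similarity to the own-class representation
    minus the mean similarity to the other class representations."""
    gaps = []
    for emb, label in zip(graph_embs, labels):
        own = cosine(emb, reps[label])
        others = [cosine(emb, rep) for c, rep in reps.items() if c != label]
        other_mean = sum(others) / len(others) if others else 0.0
        gaps.append(own - other_mean)
    return sum(gaps) / len(gaps)


# Toy graph-level embeddings standing in for the output of a GNN backbone.
graph_embs = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = [0, 0, 1, 1]

reps = class_representations(graph_embs, labels)
gap = separability_gap(graph_embs, labels, reps)
print(reps)
print(gap)
```

A larger gap means each graph sits closer to its own class representation than to the others. A term of this kind could in principle be added to a training loss to encourage separability, but whether CARE does so in this form is not stated in the summary.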