On embedding sequence correlations in attributed network for semi-supervised node classification

Haodong Zou, Zhen Duan, Xinru Guo, Shu Zhao, Jie Chen, Yanping Zhang, Jie Tang
DOI: https://doi.org/10.1016/j.ins.2021.03.044
IF: 8.1
2021-01-01
Information Sciences
Abstract: Graph Neural Networks (GNNs) have come to dominate the semi-supervised node classification task, outperforming traditional network embedding methods through their neighborhood aggregation mechanism. However, GNNs still encounter two vital problems when scaling to large networks. They require the whole graph as input and aggregate attributes only, which cause out-of-memory (OOM) problems in single-machine scenarios and attribute over-smoothing, respectively. To tackle these issues, from the traditional view, we propose a Sequence correlation preserving method for Attributed Network Embedding (SANE), which transforms the network properties into three types of sequences and preserves the correlations among them. First, SANE extracts three types of sequences from the attributed network, namely the node sequence, the attribute sequence, and the label sequence, which provide distinct insights into networks. Second, the proposed method preserves the attribute-node sequence correlation and the attribute-label sequence correlation by (1) extracting dual-directional features from the attribute sequence and (2) exploiting the extracted features to decode the node sequence and the label sequence. SANE can scale to large networks and relieve the over-smoothing caused by attribute-only aggregation. Experimental results on five real-world datasets demonstrate that SANE outperforms the state-of-the-art methods.
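To make the three sequence types in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not say how sequences are sampled, so a uniform random walk is assumed here; the function name `extract_sequences`, the toy graph, and the use of `None` for unlabeled nodes are all hypothetical choices made for illustration.

```python
import random

def extract_sequences(adj, attrs, labels, start, length, seed=0):
    """Walk the graph from `start`; return aligned node, attribute, and label sequences.

    Assumption: a uniform random walk stands in for the sampling strategy,
    which the abstract does not specify.
    """
    rng = random.Random(seed)
    node_seq = [start]
    while len(node_seq) < length:
        neighbors = adj[node_seq[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        node_seq.append(rng.choice(neighbors))
    # The attribute and label sequences are read off the visited nodes in order.
    attr_seq = [attrs[v] for v in node_seq]
    label_seq = [labels.get(v) for v in node_seq]  # None marks an unlabeled node
    return node_seq, attr_seq, label_seq

# Hypothetical toy attributed network: adjacency lists, 2-d attributes, two labeled nodes.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
attrs = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}
labels = {0: "a", 3: "b"}

node_seq, attr_seq, label_seq = extract_sequences(adj, attrs, labels, start=0, length=5)
```

In a setup like this, the attribute sequence would serve as the input to a bidirectional encoder, and the node and label sequences would serve as decoding targets. Because each walk touches only a handful of nodes, training on such sequences would not need the whole graph in memory at once.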