Abstract: We present network embedding algorithms that capture information about a node from the local distribution over node attributes around it, as observed over random walks following an approach similar to Skip-gram. Observations from neighborhoods of different sizes are either pooled (AE) or encoded distinctly in a multi-scale approach (MUSAE). Capturing attribute-neighborhood relationships over multiple scales is useful for a diverse range of applications, including latent feature identification across disconnected networks with similar attributes. We prove theoretically that matrices of node-feature pointwise mutual information are implicitly factorized by the embeddings. Experiments show that our algorithms are robust, computationally efficient and outperform comparable models on social networks and web graphs.
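
To make the factorization claim in the abstract concrete, here is the generic form of pointwise mutual information between a node and a feature. The notation is assumed for illustration and is not taken from the paper: $v$ is a node, $f$ a feature, $r$ the neighborhood scale (number of random-walk steps), and $P_r(v, f)$ the empirical probability that $f$ is observed on a node $r$ steps from $v$.

$$
\mathrm{PMI}_r(v, f) = \log \frac{P_r(v, f)}{P_r(v)\, P_r(f)}
$$

In the standard analysis of Skip-gram with negative sampling, the learned vectors satisfy, at the optimum, a relation of the following shape, where $\mathbf{x}_v^{(r)}$ and $\mathbf{y}_f^{(r)}$ are the node and feature embeddings at scale $r$ and $b$ is the number of negative samples (both assumed symbols). The exact matrices the paper derives may differ in detail.

$$
\mathbf{x}_v^{(r)\top} \mathbf{y}_f^{(r)} \approx \mathrm{PMI}_r(v, f) - \log b
$$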

What problem does this paper attempt to address?

The paper addresses how to account for both a node's own attributes and the attributes of its neighbors at different scales in network embedding, in order to improve downstream tasks such as node classification, regression, and transfer learning across networks. Specifically, the paper proposes two new network embedding algorithms:

1. **Pooled Attributed Embedding (AE)**: Neighbor attribute information from different scales is pooled into a single embedding vector.
2. **Multi-Scale Attributed Embedding (MUSAE)**: Neighbor attribute information at each scale is encoded separately, and the resulting per-scale embedding vectors are concatenated to form the final representation.

Both methods use the attribute information of nodes and their neighbors, as observed over random walks, and learn embedding vectors with a method similar to Skip-gram. The paper also proves that these embeddings implicitly factorize pointwise mutual information (PMI) matrices of node-feature co-occurrence, and reports experiments on real-world networks in which the methods outperform comparable models.

### Main contributions:

1. **Proposing new embedding algorithms**: AE and MUSAE, together with their variants AE-EGO and MUSAE-EGO, are introduced. These algorithms take into account the distribution of node attributes in the local neighborhood.
2. **Theoretical analysis**: The PMI matrices that these embedding methods implicitly factorize are derived, and the popular network embedding methods DeepWalk and Walklets are shown to be special cases of AE and MUSAE.
3. **Empirical research**: Experiments on real-world networks show that these algorithms outperform comparable methods in predicting node attributes, computational efficiency, and transfer learning.
4. **Open-source implementation**: Reference implementations of AE and MUSAE are provided and integrated into the open-source machine learning library Karate Club.

### Problems solved:

- **Joint embedding of node attributes and neighbor attributes**: Traditional network embedding methods usually consider only the network structure and ignore node attributes. The proposed algorithms consider a node's attributes together with those of its neighbors, and so better capture the semantic information of nodes.
- **Utilization of multi-scale information**: Neighbor information at different scales matters for understanding the context of a node. MUSAE encodes each scale separately, which improves the expressive power of the embedding.
- **Computational efficiency and scalability**: The proposed algorithms are computationally efficient and scale to large networks.

Through these contributions, the paper aims to provide a more effective method for processing attributed network data and to achieve better performance in a range of downstream tasks.
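
The pipeline summarized above (random walks, node-feature co-occurrence at each scale, PMI, then pooled versus per-scale embeddings) can be illustrated with a small NumPy sketch. This is not the authors' implementation: it factorizes the PMI matrices explicitly with a truncated SVD instead of training Skip-gram, and every function name, parameter, and the toy graph below are hypothetical choices made for illustration.

```python
import numpy as np


def random_walk(adj, start, length, rng):
    """Uniform random walk over an adjacency list (hypothetical helper)."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(neighbors[rng.integers(len(neighbors))])
    return walk


def cooccurrence_by_scale(adj, features, num_features, max_scale,
                          walks_per_node=20, walk_length=10, seed=0):
    """Count how often node v sees feature f on a node exactly r steps away."""
    rng = np.random.default_rng(seed)
    counts = np.zeros((max_scale, len(adj), num_features))
    for source in range(len(adj)):
        for _ in range(walks_per_node):
            walk = random_walk(adj, source, walk_length, rng)
            for i, v in enumerate(walk):
                for r in range(1, max_scale + 1):
                    if i + r >= len(walk):
                        break
                    u = walk[i + r]
                    # Assumption: count both directions of each (v, u) pair.
                    for f in features[u]:
                        counts[r - 1, v, f] += 1
                    for f in features[v]:
                        counts[r - 1, u, f] += 1
    return counts


def pmi(counts):
    """Pointwise mutual information of a node-by-feature count matrix."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        matrix = np.log(counts * total / (row * col))
    matrix[~np.isfinite(matrix)] = 0.0  # unseen pairs are set to zero
    return matrix


def factorize(matrix, dim):
    """Explicit truncated SVD, swapped in for implicit Skip-gram training."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy attributed graph: two triangles joined by one edge, four binary features.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
features = [[0, 1], [0], [1], [2], [2, 3], [3]]
counts = cooccurrence_by_scale(adj, features, num_features=4, max_scale=2)

# AE-style: pool the counts over all scales, then embed once.
ae_embedding = factorize(pmi(counts.sum(axis=0)), dim=2)

# MUSAE-style: embed each scale separately, then concatenate.
musae_embedding = np.hstack([factorize(pmi(c), dim=2) for c in counts])
```

With two scales and two dimensions per scale, `ae_embedding` has one 2-dimensional row per node and `musae_embedding` one 4-dimensional row per node. The multi-scale variant keeps the scales distinguishable at the cost of a wider vector.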

Multi-scale Attributed Node Embedding

Multi-Scale Node Embeddings for Graph Modeling and Generation

Discrete Embedding for Attributed Graphs

Flexible Attributed Network Embedding

Attributed Multi-layer Network Embedding

A Scalable Attribute-Aware Network Embedding System

Attributed Network Embedding with Micro-Meso Structure

Time- and Space-Efficiently Sketching Billion-Scale Attributed Networks

Unsupervised Representation Learning on Attributed Multiplex Network

Attributed Social Network Embedding

Graph Embedding Via Multi-Scale Graph Representations

Multiplex Network Embedding Model with High-Order Node Dependence

GAGE: Geometry Preserving Attributed Graph Embeddings

Multi-Stage Network Embedding for Exploring Heterogeneous Edges

Fast Attributed Multiplex Heterogeneous Network Embedding

Scalable attribute-aware network embedding with locality

Deep Attributed Network Embedding by Preserving Structure and Attribute Information

RoSANE: Robust and scalable attributed network embedding for sparse networks

Aspect-Level Attributed Network Embedding Via Variational Graph Neural Networks

Community-aware Graph Embedding Via Multi-Level Attribute Integration

An Adaptive Node Embedding Framework for Multiplex Networks