Representation learning of dynamic networks

Haixu Wang, Jiguo Cao, Jian Pei
2024-12-15
Abstract: This study presents a novel representation learning model tailored for dynamic networks, which describe the continuously evolving relationships among individuals within a population. The problem is encapsulated in the dimension reduction topic of functional data analysis. With dynamic networks represented as matrix-valued functions, our objective is to map this functional data into a set of vector-valued functions in a lower-dimensional learning space. This space, defined as a metric functional space, allows for the calculation of norms and inner products. By constructing this learning space, we address (i) attribute learning, (ii) community detection, and (iii) link prediction and recovery of individual nodes in the dynamic network. Our model also accommodates asymmetric low-dimensional representations, enabling the separate study of nodes' regulatory and receiving roles. Crucially, the learning method accounts for the time-dependency of networks, ensuring that representations are continuous over time. The functional learning space we define naturally spans the time frame of the dynamic networks, facilitating both the inference of network links at specific time points and the reconstruction of the entire network structure without direct observation. We validated our approach through simulation studies and real-world applications. In simulations, we compared our method's link prediction performance to existing approaches under various data corruption scenarios. For real-world applications, we examined a dynamic social network replicated across six ant populations, demonstrating that our low-dimensional learning space effectively captures interactions, roles of individual ants, and the social evolution of the network. Our findings align with existing knowledge of ant colony behavior.
Machine Learning
What problem does this paper attempt to address?
This paper addresses representation learning for dynamic networks. The authors develop a statistical model that captures and compresses the time-varying relationship information in such networks, which covers both the connections between nodes and how those connections change over time. The model targets three tasks:

1. **Attribute Learning**: Describe the characteristics of each node with vectors in a low-dimensional representation space.
2. **Community Detection**: Identify community structures that evolve over time in the network.
3. **Link Prediction and Recovery**: Predict possible future links and recover unobserved links.

### Specific Expression of the Problem

The paper represents a dynamic network as a matrix-valued function and maps it to a lower-dimensional learning space. This learning space is defined as a metric functional space, which allows norms and inner products to be calculated. In constructing this space, the authors aim to satisfy three requirements:

- **Time Dependence**: Keep the representation continuous in time so that it captures how the network evolves.
- **Low-Dimensional Representation**: Compress the high-dimensional dynamic network into a low-dimensional space, making subsequent analysis and modeling more efficient.
- **Topological Feature Preservation**: Preserve the static and dynamic topological features of the network during dimension reduction, such as community structures and interaction patterns between nodes.

### Core of the Method

To meet these requirements, the authors propose a framework based on Functional Data Analysis (FDA). Its main components are:

1. **Representation Learning Model**:
   - Represent the dynamic network \(G(t)=\{V(t), E(t)\}\) as an adjacency matrix \(A(t)\) that changes over time.
   - Map the high-dimensional adjacency matrix to a low-dimensional representation space through the mapping function \(F: \mathbb{R}^{M\times M\times[0,T]}\to\mathbb{R}^{R\times R\times[0,T]}\).
2. **Definition of Embedding Functions**:
   - Each node has two embedding vectors: \(\alpha_j(t)\) represents its outgoing connections, and \(\beta_j(t)\) represents its incoming connections.
   - These embedding vectors are functions of time, so they can capture changes in node behavior over time.
3. **Optimization Objectives**:
   - Maximize the log-likelihood function with an added regularization term to promote the stability and generalization ability of the model.
   - Estimate parameters with gradient updates, alternating between updating the embedding vectors and updating their clustering centers.
4. **Theoretical Properties**:
   - Establish the limiting behavior and asymptotic distribution of the dynamic embedding component \(\gamma_j\), which gives the model its theoretical basis.

### Experimental Verification

The authors evaluate the method through simulation experiments and real-world applications. In the ant social network application, for example, the model captures the interactions among individual ants, their roles, and the evolution of the social network, and the results are consistent with existing knowledge of ant colony behavior.

In conclusion, this paper proposes a novel statistical model for representation learning of dynamic networks that addresses challenges traditional methods face with such networks, including time dependence and the preservation of complex topological structures.
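To make the embedding and optimization steps under "Core of the Method" more concrete, here is a minimal Python sketch. It is not the authors' implementation. The summary does not state the link function, the basis, or the penalty, so the logistic link on the inner product of sender and receiver embeddings, the cosine basis, the ridge penalty, and every name used (`basis`, `link_prob`, `penalized_nll`, `fit`, `Ca`, `Cb`) are assumptions made for illustration. The alternating update of clustering centers is left out.

```python
import numpy as np

def basis(t, K):
    """Assumed smooth basis on [0, 1]: cos(pi * k * t) for k = 0..K-1, shape (S, K)."""
    t = np.asarray(t, dtype=float)
    return np.cos(np.pi * np.outer(t, np.arange(K)))

def link_prob(Ca, Cb, Phi):
    """Assumed model: P[i, j, s] = sigmoid(<alpha_i(t_s), beta_j(t_s)>).

    Ca, Cb: (M, R, K) basis coefficients of sender / receiver embeddings.
    Phi:    (S, K) basis evaluated at the S observation times.
    """
    alpha = np.einsum("mrk,sk->msr", Ca, Phi)   # sender embeddings, (M, S, R)
    beta = np.einsum("mrk,sk->msr", Cb, Phi)    # receiver embeddings, (M, S, R)
    logits = np.einsum("isr,jsr->ijs", alpha, beta)
    return 1.0 / (1.0 + np.exp(-logits)), alpha, beta

def penalized_nll(Ca, Cb, Phi, A, lam):
    """Negative Bernoulli log-likelihood plus a ridge penalty (an assumed regularizer)."""
    P, _, _ = link_prob(Ca, Cb, Phi)
    eps = 1e-12
    ll = A * np.log(P + eps) + (1 - A) * np.log(1 - P + eps)
    return -ll.sum() + lam * ((Ca ** 2).sum() + (Cb ** 2).sum())

def fit(A, times, R=2, K=4, lam=0.1, lr=0.005, steps=300, seed=0):
    """Plain gradient descent on the penalized objective.

    A: (M, M, S) binary adjacency snapshots; self-loops are not masked here.
    """
    rng = np.random.default_rng(seed)
    M = A.shape[0]
    Phi = basis(times, K)
    Ca = 0.1 * rng.standard_normal((M, R, K))
    Cb = 0.1 * rng.standard_normal((M, R, K))
    history = []
    for _ in range(steps):
        P, alpha, beta = link_prob(Ca, Cb, Phi)
        resid = P - A                                     # d(nll)/d(logit)
        g_alpha = np.einsum("ijs,jsr->isr", resid, beta)  # gradient w.r.t. alpha_i(t_s)
        g_beta = np.einsum("ijs,isr->jsr", resid, alpha)  # gradient w.r.t. beta_j(t_s)
        # Chain rule back to the basis coefficients, plus the ridge term.
        Ca -= lr * (np.einsum("isr,sk->irk", g_alpha, Phi) + 2 * lam * Ca)
        Cb -= lr * (np.einsum("jsr,sk->jrk", g_beta, Phi) + 2 * lam * Cb)
        history.append(penalized_nll(Ca, Cb, Phi, A, lam))
    return Ca, Cb, history
```

Because each embedding is a linear combination of smooth basis functions, \(\alpha_j(t)\) and \(\beta_j(t)\) are continuous in time by construction, and `link_prob` can be evaluated at time points that were never observed by passing a different `Phi`. Keeping separate coefficient arrays `Ca` and `Cb` is what makes the representation asymmetric.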