An Infinite Latent Attribute Model for Network Data

Konstantina Palla, David Knowles, Zoubin Ghahramani
DOI: https://doi.org/10.48550/arXiv.1206.6416
2012-06-28
Abstract: Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a "flat" clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single layer hierarchy over-simplify real networks.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two main problems in network data:

1. **Understanding the latent structure of the network**: The paper aims to reveal hidden structure in the network: for example, which characteristics determine the interactions between proteins in a biological network, or which mechanisms lie behind the presence or absence of connections between people in a social network.
2. **Predicting the "missing" links in the network**: The second challenge is to predict links that may exist but have not been observed, such as whether two proteins interact or whether two people will become friends.

To address these challenges, the paper proposes the Infinite Latent Attribute (ILA) model. The model captures complex dependencies in network data through a multi-level hierarchical structure: each object is represented by a binary feature vector, and each feature is further partitioned into disjoint subgroups (subclusters), which form a second layer of hierarchy. This structure lets the model describe complex network structure more flexibly, and in the paper's experiments it yields significantly improved link prediction performance on social and biological networks.

### Main contributions of the model

1. **Multi-level hierarchical structure**: Unlike existing models with a single layer of hierarchy, the ILA model partitions each feature into subclusters and can therefore capture dependencies in the network at a finer granularity. This allows it to better explain real-world network structure.
2. **Non-parametric Bayesian method**: The model uses the Indian Buffet Process (IBP) and the Chinese Restaurant Process (CRP) to infer the number of features and the number of subclusters within each feature, so neither quantity has to be specified in advance.
3. **Improved prediction performance**: The experimental results show that the ILA model significantly outperforms existing models on link prediction tasks in social and biological networks, especially in predicting "missing" links.

### Specific mechanisms of the model

- **Feature matrix \( Z \)**: Each object \( i \) is represented by a binary feature vector \( z_i \), where \( z_{im} = 1 \) if object \( i \) has feature \( m \) and \( z_{im} = 0 \) otherwise.
- **Subcluster assignment \( C \)**: The subcluster assignment within each feature \( m \) is represented by a vector \( c^{(m)} \), where \( c^{(m)}_i \) is the subcluster to which object \( i \) belongs in feature \( m \).
- **Weight matrix \( W \)**: Each feature \( m \) has its own weight matrix \( W^{(m)} \), where \( w^{(m)}_{kk'} \) is the link weight that applies when object \( i \) and object \( j \) belong to subclusters \( k \) and \( k' \) of feature \( m \), respectively.

The link probability is given by the following formula:

\[ \Pr(r_{ij} = 1 \mid z_i, z_j, C, W) = \sigma\left(\sum_m z_{im} z_{jm} \, w^{(m)}_{c^{(m)}_i c^{(m)}_j} + s\right) \]

where \( \sigma(x) = \frac{1}{1 + e^{-x}} \) is the sigmoid function, which maps its input to the interval (0, 1) so that the result is a valid probability, and \( s \) is an additive bias term. Only features that both objects possess (\( z_{im} = z_{jm} = 1 \)) contribute to the sum.

### Experimental results

The paper reports experiments on synthetic datasets and on two real-world datasets (the NIPS co-authorship network and a gene interaction network). The results show that the ILA model significantly outperforms existing models in predictive performance, especially on networks with complex structure.
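
To make the link probability formula from the "Specific mechanisms of the model" section concrete, here is a minimal Python sketch that evaluates it for a fixed, hand-written configuration. It is not the authors' implementation: it performs no inference over the IBP or CRP priors. The function names (`sigmoid`, `link_probability`), the nested-list layout of `Z`, `C` and `W`, and all numeric values are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def link_probability(i, j, Z, C, W, s):
    """Probability of a link between objects i and j.

    Assumed (hypothetical) data layout:
      Z[i][m]    : 1 if object i has feature m, else 0
      C[m][i]    : subcluster index of object i within feature m
      W[m][k][l] : weight between subclusters k and l of feature m
      s          : scalar bias added inside the sigmoid
    """
    total = s
    for m in range(len(W)):
        # A feature contributes only when both objects possess it.
        if Z[i][m] and Z[j][m]:
            total += W[m][C[m][i]][C[m][j]]
    return sigmoid(total)


# Toy configuration: 3 objects, 2 features, 2 subclusters per feature.
Z = [[1, 0],
     [1, 1],
     [0, 1]]
C = [[0, 1, 0],   # subclusters within feature 0
     [0, 0, 1]]   # subclusters within feature 1
W = [[[2.0, -1.0], [-1.0, 2.0]],
     [[1.5, -0.5], [-0.5, 1.5]]]
s = -1.0

p01 = link_probability(0, 1, Z, C, W, s)  # share feature 0, subclusters 0 and 1
p12 = link_probability(1, 2, Z, C, W, s)  # share feature 1, subclusters 0 and 1
p02 = link_probability(0, 2, Z, C, W, s)  # no shared feature: bias only
```

In this toy setup, objects 0 and 2 share no feature, so their link probability reduces to `sigmoid(s)`. Objects 0 and 1 share feature 0 but sit in different subclusters, so the negative off-diagonal weight lowers their link probability below that baseline.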