Abstract: Network-based information has been widely explored and exploited in the information retrieval literature. Attributed networks, which consist of nodes, edges, and attributes describing properties of the nodes, are a basic type of network-based data and are useful in many applications, such as user profiling in social networks and item recommendation in user-item purchase networks. Learning useful and expressive representations of entities in attributed networks can provide more effective building blocks for downstream network-based tasks such as link prediction and attribute inference. In practice, the input features of attributed networks are normalized to unit directional vectors. However, most network embedding techniques ignore the spherical nature of the inputs and focus on learning representations in a Gaussian or Euclidean space, which, we hypothesize, might lead to less effective representations. To obtain more effective representations of attributed networks, we investigate the problem of mapping an attributed network with unit-normalized directional features into a non-Gaussian and non-Euclidean space. Specifically, we propose a hyperspherical variational co-embedding for attributed networks (HCAN), which is based on generalized variational auto-encoders for heterogeneous data with multiple types of entities. HCAN jointly learns latent embeddings for both nodes and attributes in a unified hyperspherical space, so that the affinities between nodes and attributes can be captured effectively. We argue that this is a crucial feature in many real-world applications of attributed networks. Previous Gaussian network embedding algorithms violate the assumption of an uninformative prior, which leads to unstable results and poor performance. In contrast, HCAN embeds nodes and attributes as von Mises-Fisher distributions, which allows one to capture the uncertainty of the inferred representations. Experimental results on eight datasets show that HCAN yields better performance in a number of applications compared with nine state-of-the-art baselines.
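
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the HCAN model itself. It shows how a feature vector can be normalized onto the unit sphere and how a von Mises-Fisher log-density can be evaluated there. The function names are hypothetical, and the closed-form normalizer used here holds only for 3-dimensional unit vectors; the general case requires a modified Bessel function and is omitted.

```python
import math


def unit_normalize(v):
    """Scale a feature vector to unit L2 norm so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def vmf_log_density_3d(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution on the unit sphere in R^3.

    Assumes x and mu are unit vectors of length 3 and kappa > 0.
    Uses the closed form f(x) = kappa / (4 * pi * sinh(kappa)) * exp(kappa * <mu, x>).
    math.sinh overflows for very large kappa; this sketch does not guard against that.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    dot = sum(a * b for a, b in zip(mu, x))
    log_norm = math.log(kappa) - math.log(4.0 * math.pi * math.sinh(kappa))
    return log_norm + kappa * dot


# Illustrative values only: a mean direction and two points on the sphere.
mu = unit_normalize([1.0, 1.0, 0.0])
near = unit_normalize([1.0, 0.9, 0.1])
far = unit_normalize([-1.0, -1.0, 0.0])
print(vmf_log_density_3d(near, mu, kappa=5.0) > vmf_log_density_3d(far, mu, kappa=5.0))
```

The concentration parameter `kappa` plays the role that inverse variance plays for a Gaussian: larger values place more mass near the mean direction `mu`, which is how a distribution of this kind can express uncertainty about an embedding.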

CAVIAR: Categorical-Variable Embeddings for Accurate and Robust Inference

Clustering and Prediction with Variable Dimension Covariates

Addressing Dynamic and Sparse Qualitative Data: A Hilbert Space Embedding of Categorical Variables

Multi-view Heterogeneous Fusion and Embedding for Categorical Attributes on Mixed Data

Sufficient Representations for Categorical Variables

A Projection Approach to Local Regression with Variable-Dimension Covariates

Variational Causal Inference

On Sliced Inverse Regression with High-Dimensional Covariates

Learning Conditional Instrumental Variable Representation for Causal Effect Estimation

Optimal Categorical Instrumental Variables

Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data

Causal inference with Machine Learning-Based Covariate Representation

Easy Variational Inference for Categorical Models via an Independent Binary Approximation

VACA: Designing Variational Graph Autoencoders for Causal Queries

Hyperspherical Variational Co-embedding for Attributed Networks

Encoding high-cardinality string categorical variables

BIVAS: A scalable Bayesian method for bi-level variable selection with applications

Reducing the dimensionality and granularity in hierarchical categorical variables

Amortized Inference for Causal Structure Learning

Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects

Covariance Regression with High-Dimensional Predictors