Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs

Zhilin Yang, Jie Tang, William Cohen
DOI: https://doi.org/10.48550/arXiv.1508.00715
2015-08-04
Computation and Language
Abstract: We study the extent to which online social networks can be connected to open knowledge bases. The problem is referred to as learning social knowledge graphs. We propose a multi-modal Bayesian embedding model, GenVector, to learn latent topics that generate word and network embeddings. GenVector leverages large-scale unlabeled data with embeddings and represents data of two modalities (social network users and knowledge concepts) in a shared latent topic space. Experiments on three datasets show that the proposed method clearly outperforms state-of-the-art methods. We then deploy the method on AMiner, a large-scale online academic search system with a network of 38,049,189 researchers and a knowledge base of 35,415,011 concepts. Our method significantly decreases the error rate in an online A/B test with live users.