Learning Cross-lingual Word Embeddings Via Matrix Co-factorization.

Tianze Shi, Zhiyuan Liu, Yang Liu, Maosong Sun
DOI: https://doi.org/10.3115/v1/p15-2093
2015-01-01
Abstract: A joint-space model for cross-lingual distributed representations generalizes language-invariant semantic features. In this paper, we present a matrix co-factorization framework for learning cross-lingual word embeddings. We explicitly define monolingual training objectives in the form of matrix decomposition, and induce cross-lingual constraints for simultaneously factorizing monolingual matrices. The cross-lingual constraints can be derived from parallel corpora, with or without word alignments. Empirical results on a task of cross-lingual document classification show that our method is effective at encoding cross-lingual knowledge as constraints for cross-lingual word embeddings.
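To illustrate the general idea of factorizing two monolingual matrices at once under a cross-lingual constraint, here is a minimal sketch. It is not the paper's actual objective, which the abstract does not spell out. It assumes plain squared-error factorization for each language and a hypothetical weight matrix S, where S[i, j] ties word i of language A to word j of language B (for example, from word alignments). The function name cofactorize and all of its parameters are illustrative only.

```python
import numpy as np


def cofactorize(M_a, M_b, S, dim=3, lam=1.0, lr=0.01, steps=500, seed=0):
    """Jointly factorize M_a ~ W_a @ C_a.T and M_b ~ W_b @ C_b.T.

    Assumed loss (illustrative, not taken from the paper):
        ||W_a C_a^T - M_a||^2 + ||W_b C_b^T - M_b||^2
        + lam * sum_ij S[i, j] * ||W_a[i] - W_b[j]||^2
    S is a hypothetical non-negative cross-lingual weight matrix.
    """
    rng = np.random.default_rng(seed)
    W_a = 0.1 * rng.standard_normal((M_a.shape[0], dim))
    C_a = 0.1 * rng.standard_normal((M_a.shape[1], dim))
    W_b = 0.1 * rng.standard_normal((M_b.shape[0], dim))
    C_b = 0.1 * rng.standard_normal((M_b.shape[1], dim))

    row_sum = S.sum(axis=1)[:, None]  # total weight on each word of A
    col_sum = S.sum(axis=0)[:, None]  # total weight on each word of B
    history = []

    for _ in range(steps):
        # Monolingual reconstruction residuals.
        R_a = W_a @ C_a.T - M_a
        R_b = W_b @ C_b.T - M_b

        # Squared distances between every cross-lingual word pair.
        sq_dist = ((W_a[:, None, :] - W_b[None, :, :]) ** 2).sum(axis=-1)
        loss = (R_a ** 2).sum() + (R_b ** 2).sum() + lam * (S * sq_dist).sum()
        history.append(float(loss))

        # Gradients of the assumed loss.
        g_W_a = 2 * R_a @ C_a + 2 * lam * (row_sum * W_a - S @ W_b)
        g_C_a = 2 * R_a.T @ W_a
        g_W_b = 2 * R_b @ C_b + 2 * lam * (col_sum * W_b - S.T @ W_a)
        g_C_b = 2 * R_b.T @ W_b

        # Plain gradient-descent step on all four factor matrices.
        W_a -= lr * g_W_a
        C_a -= lr * g_C_a
        W_b -= lr * g_W_b
        C_b -= lr * g_C_b

    return W_a, W_b, history
```

With lam set to 0 the two factorizations are independent; a larger lam pulls the embeddings of tied word pairs toward each other in the shared space, which is the role the abstract assigns to the cross-lingual constraints.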
What problem does this paper attempt to address?