Obtaining Better Word Representations via Language Transfer

Changliang Li, Bo Xu, Gaowei Wu, Xiuying Wang, Wendong Ge, Yan Li
DOI: https://doi.org/10.1007/978-3-642-54906-9_11
2014-01-01
Abstract: Vector space word representations have recently achieved great success in improving performance across various NLP tasks. However, existing methods for learning word embeddings use only corpora in the same language as the words being embedded. Inspired by transfer learning, we propose a novel method for obtaining word embeddings via language transfer. Under this method, to obtain word embeddings for one language (the target language), we instead train models on a corpus of a different language (the source language). We then use the resulting source-language word embeddings to represent target-language words. We evaluate the word embeddings obtained by the proposed method on word similarity tasks across several benchmark datasets. The results show that our method is surprisingly effective, outperforming competitive baselines by a large margin. Another benefit of our method is that collecting a new corpus may be unnecessary.
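The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how target-language words are aligned with source-language words, so the bilingual dictionary `target_to_source` below is an assumption made for illustration only, as are the toy vectors, the example words, and the helper names `transfer_embedding` and `target_similarity`. It is not the authors' implementation.

```python
import math

# Toy source-language embeddings; the values are invented for illustration.
source_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

# Hypothetical target-to-source dictionary. This alignment step is an
# assumption: the abstract does not describe how words are mapped.
target_to_source = {"gato": "cat", "perro": "dog", "coche": "car"}


def transfer_embedding(target_word):
    """Represent a target-language word by its source-language embedding."""
    source_word = target_to_source.get(target_word)
    if source_word is None:
        return None  # no known translation for this word
    return source_embeddings.get(source_word)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def target_similarity(word_a, word_b):
    """Score a target-language word pair using transferred embeddings."""
    vec_a = transfer_embedding(word_a)
    vec_b = transfer_embedding(word_b)
    if vec_a is None or vec_b is None:
        return None  # pair cannot be scored
    return cosine(vec_a, vec_b)
```

In a word similarity evaluation, scores such as `target_similarity("gato", "perro")` would be compared against human ratings for the same word pairs, typically with a rank correlation.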