A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique

Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita
DOI: https://doi.org/10.48550/arXiv.1607.08692
2016-07-29
Computation and Language
Abstract: Most existing methods for bilingual word embedding consider only shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit, the Bilingual Sense Clique (BSC), which is derived from a maximum complete sub-graph of a pointwise mutual information based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step required by most existing works is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks, and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods.
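The clique construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the `src:`/`tgt:` tagging, the PMI threshold of 0, the binary clique-word matrix, and all function names are assumptions made for illustration. Maximal cliques are enumerated with the basic Bron-Kerbosch algorithm, and the dimension reduction step is left out.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_graph(sentence_pairs, threshold=0.0):
    """Link two words when their PMI over aligned sentence pairs exceeds threshold.

    Assumption: each aligned pair is one co-occurrence window, and source and
    target words share a single graph, distinguished only by a language tag.
    """
    word_count, pair_count = Counter(), Counter()
    n = len(sentence_pairs)
    for src, tgt in sentence_pairs:
        words = {"src:" + w for w in src} | {"tgt:" + w for w in tgt}
        word_count.update(words)
        pair_count.update(combinations(sorted(words), 2))
    graph = {w: set() for w in word_count}
    for (a, b), c in pair_count.items():
        # PMI = log( p(a, b) / (p(a) * p(b)) ), with counts normalised by n
        pmi = math.log(c * n / (word_count[a] * word_count[b]))
        if pmi > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def clique_word_matrix(cliques, vocab):
    """Binary clique-by-word incidence matrix (the weighting is an assumption)."""
    return [[1 if w in c else 0 for w in vocab] for c in cliques]


# Hypothetical toy corpus of aligned (source, target) token lists.
pairs = [
    (["bank", "money"], ["banque", "argent"]),
    (["bank", "river"], ["rive", "fleuve"]),
    (["money"], ["argent"]),
]
graph = pmi_graph(pairs)
cliques = maximal_cliques(graph)
vocab = sorted(graph)
matrix = clique_word_matrix(cliques, vocab)
```

In this toy corpus the word `src:bank` falls into more than one clique, which is the sense-splitting behaviour the abstract alludes to. A dimension reduction method would then be applied to `matrix` to obtain word vectors; which method works best is the subject of the paper's evaluation and is not reproduced here.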