Unsupervised translation disambiguation based on maximum web bilingual relatedness: web as lexicon

PengYuan Liu, TieJun Zhao
DOI: https://doi.org/10.1109/FSKD.2009.768
2009-01-01
Abstract: This paper treats the Web as a semantic lexicon in order to alleviate the problem of acquiring bilingual lexical knowledge. Four Web Bilingual Relatedness (WBR) measures are built from mixed-language web page counts. Evaluated on a modified Miller-Charles dataset, the measure based on point-wise mutual information performs best. The paper further presents a fully unsupervised translation disambiguation method that selects the translation maximizing the sum of WBR between the translation and all context words. Tested on the Multilingual Chinese-English Lexical Sample Task of SemEval-2007, the WBR disambiguation model based on point-wise mutual information achieves the best performance, outperforming previous work and reaching state-of-the-art results (P_mar = 0.451).
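The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact WBR formulas, so the code below assumes the textbook definition of point-wise mutual information over page counts, with a score of 0 when any count is zero. The function names (`pmi_wbr`, `disambiguate`), the count dictionaries, and all numbers are hypothetical; in practice the counts would come from search-engine queries that pair an English translation candidate with a Chinese context word.

```python
import math
from typing import Dict, Iterable, Tuple


def pmi_wbr(count_x: int, count_y: int, count_xy: int, total_pages: int) -> float:
    """Textbook PMI from page counts (assumed form, not taken from the paper).

    Returns 0.0 when any count is zero, a simplifying assumption made here
    to avoid taking the logarithm of zero.
    """
    if count_x <= 0 or count_y <= 0 or count_xy <= 0:
        return 0.0
    p_x = count_x / total_pages
    p_y = count_y / total_pages
    p_xy = count_xy / total_pages
    return math.log2(p_xy / (p_x * p_y))


def disambiguate(
    candidates: Iterable[str],
    context: Iterable[str],
    single_counts: Dict[str, int],
    pair_counts: Dict[Tuple[str, str], int],
    total_pages: int,
) -> str:
    """Pick the translation whose summed relatedness to the context is largest."""
    context = list(context)

    def score(translation: str) -> float:
        return sum(
            pmi_wbr(
                single_counts.get(translation, 0),
                single_counts.get(word, 0),
                pair_counts.get((translation, word), 0),
                total_pages,
            )
            for word in context
        )

    return max(candidates, key=score)


# Toy, made-up page counts for two English candidates and two Chinese context words.
TOTAL_PAGES = 1_000_000
SINGLE_COUNTS = {"bank": 10_000, "shore": 5_000, "钱": 20_000, "存款": 8_000}
PAIR_COUNTS = {
    ("bank", "钱"): 2_000,
    ("bank", "存款"): 1_000,
    ("shore", "钱"): 50,
    ("shore", "存款"): 10,
}

best = disambiguate(["bank", "shore"], ["钱", "存款"],
                    SINGLE_COUNTS, PAIR_COUNTS, TOTAL_PAGES)
```

With these toy counts, "bank" co-occurs with both context words more often than chance would predict, while "shore" co-occurs less often, so the summed score favors "bank". Real page counts are noisy, and the zero-count fallback used here is only one possible choice.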