Bilingual Active Learning For Relation Classification Via Pseudo Parallel Corpora

Longhua Qian, Haotian Hui, Yanan Hu, Guodong Zhou, Qiaoming Zhu
DOI: https://doi.org/10.3115/v1/p14-1055
2014-01-01
Abstract: Active learning (AL) has been proven effective in reducing human annotation effort in NLP. However, previous studies on AL are limited to applications in a single language. This paper proposes a bilingual active learning paradigm for relation classification, where the unlabeled instances are first jointly chosen in terms of their prediction uncertainty scores in two languages and then manually labeled by an oracle. Instead of using a parallel corpus, labeled and unlabeled instances in one language are translated into the other language, and all instances in both languages are then fed into a bilingual active learning engine as pseudo parallel corpora. Experimental results on the ACE RDC 2005 Chinese and English corpora show that bilingual active learning for relation classification significantly outperforms monolingual active learning.
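The joint selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: using entropy as the uncertainty measure and the arithmetic mean to combine the two languages' scores are assumptions made here for illustration, and the names `entropy` and `select_batch` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_batch(
    probs_zh: Dict[str, Sequence[float]],
    probs_en: Dict[str, Sequence[float]],
    batch_size: int,
) -> List[str]:
    """Pick the instances whose joint uncertainty is highest.

    probs_zh / probs_en map an instance id to the label distribution
    predicted by the Chinese and English classifiers for the two sides
    of the same pseudo parallel instance.
    Assumption: joint uncertainty is the mean of the two entropies.
    """
    joint = {
        inst_id: 0.5 * (entropy(probs_zh[inst_id]) + entropy(probs_en[inst_id]))
        for inst_id in probs_zh
    }
    # Most uncertain first; these would be sent to the oracle for labeling.
    ranked = sorted(joint, key=lambda inst_id: joint[inst_id], reverse=True)
    return ranked[:batch_size]
```

In a full active learning loop, the selected instances would be labeled by the oracle, added to the labeled sets in both languages, and the two classifiers retrained before the next selection round.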