Zero-Shot Cross-Lingual Named Entity Recognition Via Progressive Multi-Teacher Distillation

Zhuoran Li,Chunming Hu,Richong Zhang,Junfan Chen,Xiaohui Guo
DOI: https://doi.org/10.1109/taslp.2024.3449029
2024-01-01
Abstract:Cross-lingual learning aims to transfer knowledge from one natural language to another. Zero-shot cross-lingual named entity recognition (NER) tasks are to train an NER model on source languages and to identify named entities in other languages. Existing knowledge distillation-based models in a teacher-student manner leverage the unlabeled samples from the target languages and show their superiority in this setting. However, the valuable similarity information between tokens in the target language is ignored. And the teacher model trained solely on the source language generates low-quality pseudo-labels. These two facts impact the performance of cross-lingual NER. To improve the reliability of the teacher model, in this study, we first introduce one extra simple binary classification teacher model by similarity learning to measure if the inputs are from the same class. We note that this binary classification auxiliary task is easier, and the two teachers simultaneously supervise the student model for better performance. Furthermore, given such a stronger student model, we propose a progressive knowledge distillation framework that extensively fine-tunes the teacher model on the target-language pseudo-labels generated by the student model. Empirical studies on three datasets across seven different languages show that our presented model outperforms state-of-the-art methods
What problem does this paper attempt to address?