Cross-Lingual Transfer Learning for Medical Named Entity Recognition
Pengjie Ding, Lei Wang, Yaobo Liang, Wei Lu, Linfeng Li, Chun Wang, Buzhou Tang, Jun Yan
DOI: https://doi.org/10.1007/978-3-030-59410-7_28
2020-01-01
Abstract: A wide range of techniques has been explored for cross-lingual transfer learning. In the medical domain, Named Entity Recognition (NER) is pivotal for many downstream tasks, such as medical entity linking and clinical decision support systems. However, the scarcity of annotated data limits its applicability in many languages. To alleviate this issue and make use of languages with sufficient annotated data, we introduce a new way to obtain a medical parallel corpus from medical terminology systems and knowledge bases, and propose a methodology that combines cross-lingual language model pretraining with bilingual word embedding alignment, using this parallel corpus. Moreover, because our combined architecture maintains the framework of the pretrained model, it can be used not only for NER but also for other downstream NLP tasks. Experiments demonstrate that incorporating Chinese and English medical data effectively improves performance on an English medical NER dataset (i2b2).
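As an illustration of what bilingual word embedding alignment from a parallel term list can look like, the following is a minimal sketch using orthogonal Procrustes, a common alignment technique. The abstract does not specify the alignment objective, so this is an assumption rather than the authors' method. The function name `procrustes_align` and the synthetic embeddings are hypothetical; in practice each row pair would come from a Chinese term and its English counterpart in the parallel corpus.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    src and tgt are (n_pairs, dim) arrays whose i-th rows are the
    embeddings of a translation pair (hypothetical parallel terms).
    """
    # Closed-form solution: SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Synthetic stand-in data (assumption): the "source language" embeddings
# are an exact rotation of the "target language" embeddings.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(50, 8))
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
src = tgt @ rotation.T

w = procrustes_align(src, tgt)
aligned = src @ w  # source embeddings mapped into the target space
```

On this synthetic data the learned map undoes the rotation, so `aligned` matches `tgt` up to floating-point error. Real embeddings would only align approximately.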