Named Entity Recognition in Chinese Social Media Based on the Unified Model

Li YI, Peng HUANG, Yanbing PENG, Guang CHENG
DOI: https://doi.org/10.3969/j.issn.1672-9722.2017.12.017
2017-01-01
Abstract: Named Entity Recognition (NER) in Chinese social media has become increasingly important with the development of the internet. Previous methods focus on in-domain supervised learning, which is limited by the scarcity of annotated data. However, there are ample corpora in formal domains and massive amounts of in-domain unannotated text that can be used to improve the task. A unified model that can learn from both out-of-domain corpora and in-domain unannotated texts is proposed. The unified model contains two major functions: one for cross-domain learning and one for semi-supervised learning. The cross-domain learning function learns out-of-domain information based on domain similarity. The semi-supervised learning function learns in-domain unannotated information by self-training. Both learning functions outperform existing methods for NER in Chinese social media. Experiments with the unified model yield better results and reduce the workload of manually tagging corpora.
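The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the character-level count-based tagger, the function names (`train`, `predict`, `self_train`), the confidence threshold, and the per-example weight are all hypothetical choices made for illustration. The weight field is only a placeholder for where a domain-similarity score for out-of-domain examples could plausibly enter; how the paper actually computes or applies such a score is not stated in the abstract.

```python
from collections import Counter, defaultdict

def train(examples):
    """Toy tagger: weighted tag counts per character.
    examples: list of (tokens, tags, weight); the weight is a hypothetical
    stand-in for a domain-similarity score on out-of-domain data."""
    counts = defaultdict(Counter)
    for tokens, tags, weight in examples:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += weight
    return counts

def predict(model, tokens):
    """Return (tags, confidence); confidence is the lowest per-token
    relative frequency of the chosen tag, 0.0 if any token is unseen."""
    tags, conf = [], 1.0
    for tok in tokens:
        dist = model.get(tok)
        if not dist:
            tags.append("O")
            conf = 0.0
            continue
        tag, count = dist.most_common(1)[0]
        tags.append(tag)
        conf = min(conf, count / sum(dist.values()))
    return tags, conf

def self_train(labeled, unlabeled, threshold=0.9, rounds=3):
    """Generic self-training loop: pseudo-label the unannotated sentences,
    keep only confident predictions, retrain, and repeat."""
    data, pool = list(labeled), list(unlabeled)
    model = train(data)
    for _ in range(rounds):
        remaining, added = [], False
        for tokens in pool:
            tags, conf = predict(model, tokens)
            if conf >= threshold:
                data.append((tokens, tags, 1.0))  # treat as in-domain data
                added = True
            else:
                remaining.append(tokens)
        pool = remaining
        if not added:
            break
        model = train(data)
    return model, pool

labeled = [(["张", "三", "去", "北", "京"],
            ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"], 1.0)]
unlabeled = [["张", "三"], ["李", "四"]]
model, leftover = self_train(labeled, unlabeled)
```

In this toy setup the sentence made of already-seen characters is pseudo-labeled and absorbed into the training data, while the sentence with unseen characters stays in the unlabeled pool. A realistic system would replace the count-based tagger with a sequence model and a calibrated confidence estimate.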