A BERT based Chinese Named Entity Recognition method on ASEAN News

Haoyu Zhuang, Fu Wang, Songlin Bo, Yongzhong Huang
DOI: https://doi.org/10.1088/1742-6596/1848/1/012101
2021-04-01
Journal of Physics: Conference Series
Abstract: As the first step in building a knowledge graph that records information about the ASEAN countries, we aim to perform named-entity recognition (NER) on Chinese news about these countries. We replace the LSTM architecture with a bi-directional gated recurrent unit to improve both the model's effectiveness and its capability to understand polysemous words. The state-of-the-art word embedding model BERT is also included to generate high-quality word vectors for the NER task. In addition, we propose a similarity-based dataset partition method to help the model learn the polysemy within Chinese news. Experiments demonstrate that the combination of these improvements benefits the model's performance in identifying different types of named entities.