Chinese Named Entity Recognition Based on Embedded Pinyin Information

Ping Feng, GuoLiang Li, Yingying Wang, Xing Zhang
DOI: https://doi.org/10.1109/iccece54139.2022.9712691
2022-01-14
Abstract: Recently, word fusion techniques have become increasingly popular in Chinese Named Entity Recognition (CNER) because they can make full use of explicit word information and word sequence information. However, these methods often ignore Chinese pinyin information. Pinyin characterizes the pronunciation of Chinese words and can therefore handle the phenomenon of “same words with different pronunciation” (the same word having different pronunciations and different meanings) in Chinese. This paper therefore proposes the Py-CNER (Pinyin Chinese Named Entity Recognition) model, which embeds word information and pinyin information into a dual-stream converter to improve Chinese named entity recognition metrics. Experiments on the Weibo, Resume, and OntoNotes4 datasets demonstrate the advantages of fusing pinyin information.
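To make the "same words with different pronunciation" idea concrete, the following minimal sketch shows how attaching a pinyin vector to a character vector can separate two readings of one character. It is an illustration only, not the Py-CNER architecture: the toy embedding tables, the hand-supplied pinyin labels, and the use of plain concatenation as the fusion step are all assumptions, since the abstract does not specify how the dual-stream model combines the two inputs.

```python
# Toy illustration: the character 长 is read "chang2" in 长城 (Great Wall)
# and "zhang3" in 市长 (mayor). All vectors below are made-up values;
# a real model would learn them during training.
CHAR_DIM = 2

char_emb = {
    "长": [0.1, 0.2],
    "城": [0.3, 0.4],
    "市": [0.5, 0.6],
}
pinyin_emb = {
    "chang2": [1.0, 0.0],
    "zhang3": [0.0, 1.0],
    "cheng2": [0.5, 0.5],
    "shi4": [0.2, 0.8],
}


def fuse(chars, pinyins):
    """Concatenate each character vector with its pinyin vector.

    Concatenation is an assumed fusion strategy chosen for simplicity.
    """
    if len(chars) != len(pinyins):
        raise ValueError("need exactly one pinyin label per character")
    return [char_emb[c] + pinyin_emb[p] for c, p in zip(chars, pinyins)]


great_wall = fuse("长城", ["chang2", "cheng2"])
mayor = fuse("市长", ["shi4", "zhang3"])

# The character part of 长 is identical in both words ...
same_char_part = great_wall[0][:CHAR_DIM] == mayor[1][:CHAR_DIM]
# ... but the fused vectors differ because the pinyin part differs.
same_fused = great_wall[0] == mayor[1]

print(same_char_part, same_fused)
```

With character embeddings alone, 长 would be represented identically in both words; the pinyin component is what gives a downstream tagger a signal to tell the two readings apart.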