A BiLSTM-CRF Based Approach to Word Segmentation in Chinese

Yuanyuan Jin, Shiyu Tao, Qi Liu, Xiaodong Liu
DOI: https://doi.org/10.1109/dasc/picom/cbdcom/cy55231.2022.9927991
2022-01-01
Abstract: This paper proposes an approach to Chinese word segmentation. The model combines Bi-directional Long Short-Term Memory (BiLSTM) with Conditional Random Fields (CRF) and adopts a four-state word segmentation model, DSZM, so that it can account for correlations between preceding and following positions in the sequence, as a CRF does, while retaining the feature extraction and fitting capabilities of a BiLSTM.
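To make the idea of four-state tagging concrete, the following is a minimal sketch of how a segmented sentence can be turned into one label per character and back again. The abstract does not define the letters in DSZM, so the mapping used here (D = single-character word, S = first character, Z = middle character, M = last character) is an assumption made for illustration. The function names and the sample sentence are hypothetical. The BiLSTM-CRF network that would predict these labels is not shown.

```python
# Assumed mapping, NOT stated in the abstract:
#   D = single-character word, S = first character of a word,
#   Z = middle character,      M = last character of a word.
SINGLE, START, MIDDLE, END = "D", "S", "Z", "M"


def words_to_tags(words):
    """Convert a list of words into one four-state tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # skip empty strings
        if len(word) == 1:
            tags.append(SINGLE)
        else:
            tags.append(START)
            tags.extend([MIDDLE] * (len(word) - 2))
            tags.append(END)
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their four-state tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        # A word is complete after a single-character tag or an end tag.
        if tag in (SINGLE, END):
            words.append(current)
            current = ""
    if current:
        # Keep any trailing characters from an unterminated word.
        words.append(current)
    return words


sample_words = ["我", "喜欢", "自然语言"]
sample_tags = words_to_tags(sample_words)
restored = tags_to_words("".join(sample_words), sample_tags)
```

Under this assumed mapping, segmentation becomes a sequence labelling task: a tagger assigns one of four labels to each character, and a decoding step like `tags_to_words` recovers the word boundaries.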