Neural architecture for Tibetan word segmentation

Mengzhu Chen, Shengjie Zhao, Kai Yang
DOI: https://doi.org/10.1109/IALP.2017.8300619
2017-01-01
Abstract: Tibetan word segmentation (TWS) is a primary task in Tibetan language processing. In this paper, a novel hybrid neural architecture is proposed for TWS, which is treated as a sequence tagging task. Because contracted words occur frequently in Tibetan sentences, we first use a conditional random field (CRF) to handle them. We then use character embeddings to obtain basic character representations as input. Most importantly, we apply a bidirectional long short-term memory network and CRF (BiLSTM-CRF) in our system. Experimental results show that our approach achieves state-of-the-art performance compared with previous approaches to TWS.
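To make the "segmentation as sequence tagging" framing concrete, the sketch below converts between a segmented sentence and a per-unit tag sequence. The abstract does not state which tag set the authors use; the B/M/E/S scheme (begin, middle, end, single) is an assumption borrowed from common segmentation practice, and the function names and the placeholder units `u1`...`u6` are hypothetical. In a BiLSTM-CRF system, the tagger would predict such tags and a decoding step like `tags_to_words` would recover the words.

```python
from typing import List


def words_to_tags(words: List[List[str]]) -> List[str]:
    """Map each word (a list of units) to B/M/E/S tags (assumed tag scheme)."""
    tags: List[str] = []
    for units in words:
        if len(units) == 1:
            tags.append("S")  # single-unit word
        else:
            # first unit is B, last is E, anything in between is M
            tags.extend(["B"] + ["M"] * (len(units) - 2) + ["E"])
    return tags


def tags_to_words(units: List[str], tags: List[str]) -> List[List[str]]:
    """Rebuild words from a flat unit sequence and its predicted tags."""
    words: List[List[str]] = []
    current: List[str] = []
    for unit, tag in zip(units, tags):
        current.append(unit)
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = []
    if current:  # tolerate a sequence that ends without E or S
        words.append(current)
    return words


# Placeholder units stand in for real Tibetan syllables or characters.
segmented = [["u1", "u2"], ["u3"], ["u4", "u5", "u6"]]
flat_units = [u for word in segmented for u in word]
tags = words_to_tags(segmented)
recovered = tags_to_words(flat_units, tags)
```

With this framing, the segmenter's output is one tag per input unit, which is the shape a CRF output layer is designed to score jointly.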