A Novel Chinese Word Segmentation Method Utilizing Morphology Information

Xu Shuona, Zeng Biqing
DOI: https://doi.org/10.1007/978-3-642-03718-4_41
2012-01-01
Abstract: In this paper, we present a novel approach that integrates morphology information into the statistical model for Chinese word segmentation (CWS) and yields higher accuracy than the traditional approach based on conditional random fields (CRFs). The improvements are mainly attributed to two aspects. Firstly, the structure information within words is integrated into the CRFs model by annotating the Chinese word corpus with morphology tags, which convey the construction modes of Chinese words. Secondly, the training process adopts a joint CRFs model that integrates structure information with other context: it combines the morphology tag and the word boundary at the same state level, so that word segmentation and morphology tag identification are performed together and complement each other. Experimental results show that the morphology information is of great use to word segmentation.
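The idea of placing the morphology tag and the word boundary "in the same state level" can be illustrated with a small labeling sketch. The abstract does not give the actual tag inventory, so everything here is an assumption made for illustration: the B/M/E/S boundary scheme, the morphology tag names `MOD` and `SIMP`, and the helper names `boundary_tags`, `encode`, and `decode`. The sketch only shows how a joint label per character might be built and how a predicted joint label sequence yields both the segmentation and the morphology tags; it does not train a CRF.

```python
from typing import List, Tuple


def boundary_tags(word: str) -> List[str]:
    """Return B/M/E/S position tags for each character of a word (assumed scheme)."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def encode(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map (word, morph_tag) pairs to (character, joint_label) pairs.

    Each joint label fuses the boundary tag and the (hypothetical)
    morphology tag, e.g. "B-MOD", so one label sequence carries both.
    """
    out = []
    for word, morph in words:
        for ch, pos in zip(word, boundary_tags(word)):
            out.append((ch, f"{pos}-{morph}"))
    return out


def decode(chars: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Recover (word, morph_tag) pairs from a joint label sequence."""
    words = []
    buf = ""
    for ch, label in zip(chars, labels):
        pos, morph = label.split("-", 1)
        buf += ch
        if pos in ("E", "S"):  # a word ends at this character
            words.append((buf, morph))
            buf = ""
    if buf:  # flush an unterminated word, using the last label's morphology tag
        words.append((buf, labels[-1].split("-", 1)[1]))
    return words


# Hypothetical annotated sentence: "火车" tagged MOD, single characters tagged SIMP.
sentence = [("火车", "MOD"), ("来", "SIMP"), ("了", "SIMP")]
tagged = encode(sentence)
chars = [ch for ch, _ in tagged]
labels = [label for _, label in tagged]
restored = decode(chars, labels)
```

In a setup like this, a sequence labeler would be trained on the joint labels directly, so that a single decoding pass produces word boundaries and morphology tags at once.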