Integrate Statistical Model And Lexical Knowledge For Chinese Multiword Chunking

Qiang Zhou, Hang Yu
DOI: https://doi.org/10.1109/NLPKE.2008.4906765
2008-01-01
Abstract: Multiword chunking (MWC) is a shallow parsing technique that recognizes the external constituent tag and the internal relation tag of each chunk in a sentence. In this paper, we propose a new solution to this problem. We design a new relation tagging scheme to represent different intra-chunk relations and run several feature-engineering experiments to select the best baseline statistical model. We also apply external knowledge from a large-scale lexical relationship knowledge base to improve parsing performance. By integrating these techniques, we develop a new Chinese MWC parser. Experimental results show that its parsing performance greatly exceeds that of a rule-based parser trained and tested on the same data set.