Parsing Chinese Based on Lexicalized Model

Hai-long Cao, Tie-jun Zhao, Sheng Li
2007-01-01
Abstract: To process large-scale real-world text, a method for building a Chinese parser based on a lexicalized model is proposed. First, a unified approach to word segmentation and part-of-speech (POS) tagging is proposed, based on a hidden Markov model (HMM). The method not only preserves the simplicity and efficiency of the HMM but also improves tagging accuracy. Then a head-driven model is used to recognize phrases. The head-driven model is a well-known English parsing model; we combine it with the segmentation and POS tagging model to build a Chinese parser that operates at the character level. The parser is evaluated on the standard test set, where it achieves 77.57% precision and 74.96% recall, significantly outperforming the only previous comparable work.
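One common way to treat segmentation and POS tagging as a single HMM problem is to tag each character with a joint label that combines a word-boundary mark with a POS tag, then decode with the Viterbi algorithm. The sketch below illustrates that general idea only; it is not the authors' model. The state set, the two-tag inventory (`N`, `V`), every probability in `EMIT`, and all function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy model: all probabilities are invented for illustration.
POS_TAGS = ["N", "V"]
# Joint states pair a word-boundary mark with a POS tag:
# "B-N" = first character of a noun, "I-N" = a later character of the same noun.
STATES = [f"{mark}-{pos}" for pos in POS_TAGS for mark in ("B", "I")]

EMIT = {
    "B-N": {"我": 0.4, "北": 0.4, "爱": 0.1, "京": 0.1},
    "I-N": {"京": 0.7, "北": 0.1, "我": 0.1, "爱": 0.1},
    "B-V": {"爱": 0.7, "我": 0.1, "北": 0.1, "京": 0.1},
    "I-V": {"我": 0.05, "爱": 0.05, "北": 0.05, "京": 0.05},
}
UNSEEN = 1e-6  # assumed floor probability for characters missing from EMIT


def log_start(state):
    # A sentence can only start at the beginning of a word.
    return math.log(1 / len(POS_TAGS)) if state.startswith("B-") else -math.inf


def log_trans(prev, cur):
    _, prev_pos = prev.split("-")
    cur_mark, cur_pos = cur.split("-")
    if cur_mark == "I":
        # Continuing a word is only allowed with the same POS tag.
        return math.log(0.5) if cur_pos == prev_pos else -math.inf
    # Otherwise a new word starts, with any POS tag equally likely.
    return math.log(0.5 / len(POS_TAGS))


def log_emit(state, ch):
    return math.log(EMIT[state].get(ch, UNSEEN))


def viterbi(chars):
    """Return the most probable joint-state sequence for a non-empty string."""
    best = [{s: log_start(s) + log_emit(s, chars[0]) for s in STATES}]
    back = []
    for ch in chars[1:]:
        scores, pointers = {}, {}
        for cur in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + log_trans(p, cur))
            scores[cur] = best[-1][prev] + log_trans(prev, cur) + log_emit(cur, ch)
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def to_words(chars, path):
    """Convert per-character joint states into (word, POS) pairs."""
    words = []
    for ch, state in zip(chars, path):
        mark, pos = state.split("-")
        if mark == "B":
            words.append([ch, pos])
        else:
            words[-1][0] += ch
    return [tuple(w) for w in words]


sentence = "我爱北京"
print(to_words(sentence, viterbi(sentence)))
```

With these toy numbers the decoder segments and tags the input in one pass, printing `[('我', 'N'), ('爱', 'V'), ('北京', 'N')]`. Because the word boundaries and POS tags come out of the same character-level search, a downstream phrase recognizer can consume raw characters rather than pre-segmented words.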