Parsing Penn Chinese treebank (CTB) with head-driven model

Hailong Cao, Tiejun Zhao, Sheng Li
DOI: https://doi.org/10.3321/j.issn:1002-0470.2007.01.004
2007-01-01
Abstract: This paper reports new progress on parsing the Penn Chinese Treebank (CTB), one of the most important technologies in Chinese information processing. The well-known head-driven model was applied to the newly available CTB5.0, and a parsing experiment on it was performed for the first time. Compared with previous work on CTB, the experiment achieved more promising results and greatly narrowed the performance gap between Chinese and English parsing. The parser was evaluated on the standard test set with the PARSEVAL metrics. It achieved a precision of 85.89% and a recall of 85.61% on sentences with gold-standard segmentation and POS tagging. The construction of the parser is described, and the contributions of two important techniques that significantly improve parsing performance are analyzed. This work can serve as a reference for developing Chinese parsers for real applications.
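For readers unfamiliar with PARSEVAL, the following is a minimal sketch of how labeled bracket precision and recall are typically computed for one sentence. It is an illustration only, not the authors' evaluation code: the function names `labeled_spans` and `parseval` and the three-word toy trees are hypothetical, and the sketch omits conventions that a full scorer usually applies, such as ignoring punctuation.

```python
from collections import Counter


def labeled_spans(tree: str) -> Counter:
    """Collect (label, start, end) for each phrasal constituent of a bracketed tree.

    Preterminal (POS tag) nodes are skipped, as PARSEVAL scores only phrases.
    Hypothetical helper for illustration.
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []  # each entry: [label, start word index, has a child constituent]
    pos = 0     # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            if stack:
                stack[-1][2] = True      # the parent now has a child node
            stack.append([tokens[i + 1], pos, False])
            i += 2                       # skip "(" and the label
        elif tok == ")":
            label, start, has_child = stack.pop()
            if has_child:                # phrasal node, not a POS tag
                spans[(label, start, pos)] += 1
            i += 1
        else:
            pos += 1                     # a word
            i += 1
    return spans


def parseval(gold: str, predicted: str):
    """Return labeled precision, recall and F1 for one sentence."""
    g, p = labeled_spans(gold), labeled_spans(predicted)
    matched = sum((g & p).values())      # multiset intersection
    precision = matched / sum(p.values())
    recall = matched / sum(g.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Toy trees (invented for this example, not taken from CTB).
gold = "(IP (NP (NN a)) (VP (VV b) (NP (NN c))))"
pred = "(IP (NP (NN a)) (VP (VV b) (NN c)))"
precision, recall, f1 = parseval(gold, pred)
print(precision, recall, f1)
```

In the toy example the predicted tree misses one of the four gold constituents, so precision is 3/3 and recall is 3/4. Applying the same harmonic-mean formula to the reported precision and recall gives an F1 of roughly 85.75, a figure derived here rather than quoted from the paper.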