Parsing Chinese with a Generalized Categorial Grammar

Manjuan Duan, William Schuler
DOI: https://doi.org/10.18653/v1/w15-3304
2015-01-01
Abstract: Categorial grammars are attractive because they have a clear account of unbounded dependencies. This account is especially important in Mandarin Chinese, which makes extensive use of unbounded dependencies. However, parsers trained on existing categorial grammar annotations (Tse and Curran, 2010) extracted from the Penn Chinese Treebank (Xue et al., 2005) are not as accurate as those trained on the original treebank, possibly because enforcing a small set of inference rules in these grammars leads to large sets of categories, which cause sparse data problems. This work reannotates the Penn Chinese Treebank into a generalized categorial grammar that uses a larger rule set and a substantially smaller category set while retaining the capacity to model unbounded dependencies. Experimental results show a statistically significant improvement in parsing accuracy with this categorial grammar.