Accurate Learning for Chinese Function Tags from Minimal Features.

Caixia Yuan, Fuji Ren, Xiaojie Wang
DOI: https://doi.org/10.3115/1667884.1667893
2009-01-01
Abstract: Data-driven function tag assignment has been studied for English using Penn Treebank data. In this paper, we address the question of whether such a method can be applied to other languages and treebank resources. In addition to simply extending the previous method from English to Chinese, we also propose an effective way to recognize function tags directly from lexical information, which scales easily to languages that lack sufficient parsing resources or pose inherent linguistic challenges for parsing. We investigate a supervised sequence learning method to automatically recognize function tags, which achieves an F-score of 0.938 on gold-standard POS (Part-of-Speech) tagged Chinese text, a statistically significant improvement over existing Chinese function label assignment systems. Results show that a small number of linguistically motivated lexical features are sufficient to achieve performance comparable to that of systems using sophisticated parse trees.