Tibetan Functional Chunks Boundary Recognition Based on Error-Driven Learning Strategy

WANG Tianhang, SHI Shumin, LONG Congjun, HUANG Heyan, LI Lin
2014-01-01
Abstract: Tibetan syntactic functional chunk parsing aims to identify the syntactic constituents of Tibetan sentences, which facilitates further sentence analysis. Based on the characteristics of Tibetan and on a description system for Tibetan syntactic functional chunks, this paper proposes an error-driven learning strategy for identifying chunk boundaries. Chunk boundaries are first recognized with a Conditional Random Fields (CRFs) model. The recognition results are then revised using the Transformation-based Error-driven Learning (TBL) method and a CRFs error-driven method, which increase the F value by 1.65% and 8.36%, respectively. Following an analysis of the experiments, we further combine the two error-driven techniques. In experiments on a Tibetan corpus of 18,073 words, the precision, recall, and F value reach 94.76%, 94.1%, and 94.43%, respectively, which demonstrates that the method is effective.
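The reported F value can be checked against the reported precision and recall. The sketch below assumes that "F value" means the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the abstract's "F value" is this balanced (F1) form.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported in the abstract.
f_value = f_measure(94.76, 94.10)
print(round(f_value, 2))  # prints 94.43
```

Under that assumption, the computed value agrees with the 94.43% given in the abstract.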