Base Chunk Scheme for the Chinese Language

Qiang Zhou
2007-01-01
Abstract: Chunk parsing is an important technique in natural language processing, and its effectiveness rests on a suitable and efficient chunk scheme. In this paper, we propose a new topology-based base chunk scheme for the Chinese language. After introducing lexical cohesion relationships to determine three basic topological structures, we form a better set of principles for analyzing the content cohesion of a base chunk and build an efficient bridge linking its syntactic form to its semantic meaning. Based on this chunk scheme, we can greatly simplify the procedure for automatically extracting useful base-chunk-annotated corpora and the corresponding lexical cohesion knowledge from TCT, a large-scale syntactically annotated Chinese corpus. This work lays a good foundation for further development of a Chinese base chunk parser and lexical cohesion knowledge acquisition tools.