Abstract: [...] hundred thousand words, with an average chunk length of two words. Many automatic partial parsers proposed at the conference showed good parsing performance, but it is still very difficult to grasp the overall structure of a sentence from this chunk information alone. This paper proposes a new syntactic annotation scheme, the functional chunk, which represents information about the grammatical relations between sentence-level predicates and their arguments. Under this scheme, each sentence is exhaustively partitioned into a series of non-nested, non-overlapping units labeled with functional tags such as subject, object, predicate, and complement, while any structural relations within these chunks are left implicit. Compared with Abney's chunks, the novelty of the functional chunk lies in the definition of what constitutes a chunk. Abney's chunks are defined strictly from the bottom up: a unit qualifies as a chunk based on its internal make-up, regardless of any changes in the larger context, and its category is primarily determined by the category of its head word.
By contrast, functional chunks are defined strictly from the top down: a unit qualifies as a chunk based on its position in the larger context, regardless of its internal make-up, and its category is primarily determined by the grammatical relation between it and the predicate. This top-down characteristic gives more detailed information for grasping the overall structure of a sentence than Abney's chunks do. Under this scheme, we built a Chinese chunk bank of about two million Chinese characters and developed several learned models for automatically annotating fresh text with functional chunks. We also propose a two-stage approach to building a Chinese tree bank on top of the chunk bank, and give experimental results from a chunk-based syntactic parser that show the advantage of functional chunks for parsing performance. All this work lays a good foundation for a further research project to build a large-scale Chinese tree bank.
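The partition property stated in the abstract (every sentence split exhaustively into non-nested, non-overlapping units, each carrying a functional tag) can be illustrated with a minimal sketch. This is not the authors' annotation format: the tag labels `S`, `P`, `O`, `C`, the `FunctionalChunk` class, the `is_exhaustive_partition` helper, and the English example tokens are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical tag inventory: the abstract names these functions
# (subject, predicate, object, complement) but not their labels.
FUNCTIONAL_TAGS = {"S": "subject", "P": "predicate", "O": "object", "C": "complement"}


@dataclass(frozen=True)
class FunctionalChunk:
    start: int  # index of the first token in the chunk (inclusive)
    end: int    # index just past the last token in the chunk (exclusive)
    tag: str    # functional tag, assigned by the chunk's relation to the predicate


def is_exhaustive_partition(chunks: List[FunctionalChunk], n_tokens: int) -> bool:
    """Return True if the chunks tile [0, n_tokens) with no gaps, overlaps, or nesting."""
    position = 0
    for chunk in sorted(chunks, key=lambda c: (c.start, c.end)):
        if chunk.tag not in FUNCTIONAL_TAGS:
            return False  # unknown functional tag
        if chunk.start != position or chunk.end <= chunk.start:
            return False  # gap, overlap, nesting, or empty chunk
        position = chunk.end
    return position == n_tokens


# Illustrative sentence (English stand-in for a segmented Chinese sentence).
tokens = ["we", "built", "a", "chunk", "bank"]
chunks = [
    FunctionalChunk(0, 1, "S"),  # "we"
    FunctionalChunk(1, 2, "P"),  # "built"
    FunctionalChunk(2, 5, "O"),  # "a chunk bank": internal structure left implicit
]
valid = is_exhaustive_partition(chunks, len(tokens))
```

Note that the object chunk spans three tokens without recording their internal structure, which mirrors the scheme's choice to leave relations inside a chunk implicit.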
