Building A Chinese Shallow Parsed Treebank For Collocation Extraction

Bl Li, Q Lu, Y Li
DOI: https://doi.org/10.1007/3-540-36456-0_41
2003-01-01
Abstract: To automatically extract Chinese collocations and build a large-scale collocation bank, we are developing a one-million-word Chinese shallow parsed treebank. The treebank can be used not only as a training set for our shallow parser, but also as processed data from which collocations are extracted. This paper presents several issues related to this ongoing project, such as our definition of shallow parsing used in Chinese collocation extraction, guideline preparation, and quality control.