Construction and Application of the Knowledge Base of Chinese Multi-word Expressions

Lei Wang, Shujing Li, Weiguang Qu, Shiwen Yu
DOI: https://doi.org/10.1007/978-3-642-45185-0_59
2013-01-01
Abstract: Multi-word Expressions (MWEs, also called “idiomatic expressions” or “set phrases”) are very common in everyday language use. Most linguists hold that MWEs should be an inclusive concept that covers not only lexical units such as idioms, idiomatic expressions, xiehouyu, and proper nouns, but also non-lexical units such as proverbs, maxims, and adages. Even expressions that are only statistically idiosyncratic are to be counted as MWEs. In NLP tasks such as word segmentation and semantic role labeling, MWEs remain a bottleneck problem. Constructing a knowledge base of MWEs with relatively complete entries and tagged attributes would therefore be an effective solution to this problem. This paper describes the construction and application of an MWE knowledge base by the Institute of Computational Linguistics at Peking University (ICL/PKU), with which the authors hope to support research in this area.