Exploiting Lexicalized Statistical Patterns in Chinese Linguistic Analysis

Yu Zhao, Maosong Sun
DOI: https://doi.org/10.1007/978-3-642-41491-6_22
2013-01-01
Abstract: The web corpus has been used for linguistic analysis with the help of search engines. In this paper, we describe the concept of lexicalized patterns, which we exploit to obtain statistical information through simple string matching via search engines. We discuss the use of lexicalized statistical patterns at three linguistic levels of Chinese analysis: lexical, syntactic, and semantic. We develop a specialized search engine to obtain frequency counts for these patterns on the SogouT corpus. Experimental results show that lexicalized statistical patterns are effective for analyzing the cohesion of phrases, determining phrasal categories, and discovering patient objects.