Large-scale Automatic Extraction of Chinese Compound Lexical Cohesion Pairs

QIU Han, ZHOU Qiang
DOI: https://doi.org/10.16511/j.cnki.qhdxxb.2011.09.028
2011-01-01
Abstract: Lexical cohesion pairs (LCPs) are pairs of content words that reflect different syntactic relationships in real texts. Credible lexical cohesion pairs play an important role in compound word analysis, syntactic disambiguation, long-distance relationship recognition, and word sense disambiguation. This paper presents a method that combines basic chunk parsing with statistics-based filtering to automatically extract credible LCPs from a large corpus. Tests on the 36-million-word People's Daily corpus yielded about 730 thousand Chinese noun-noun and verb-noun LCPs, with a 35.6% abandoning ratio.
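The abstract does not say which statistic drives the filtering step, so the following is only a minimal sketch of what statistics-based filtering of candidate pairs might look like. It assumes the candidate (word, word) pairs have already been produced by a chunk parser, and it uses a frequency threshold plus pointwise mutual information (PMI) as a stand-in criterion. The function name `filter_pairs`, the thresholds, and the reading of "abandoning ratio" as the fraction of distinct candidate pairs discarded are all assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter


def filter_pairs(candidates, min_count=2, min_pmi=0.0):
    """Filter candidate word pairs by frequency and PMI.

    candidates: list of (word1, word2) tuples, one per occurrence,
    assumed to come from an upstream chunk parser (hypothetical input).
    Returns (kept, ratio): kept maps each retained pair to its PMI,
    ratio is the fraction of distinct candidate pairs discarded.
    """
    pair_counts = Counter(candidates)
    left_counts = Counter(w1 for w1, _ in candidates)
    right_counts = Counter(w2 for _, w2 in candidates)
    total = len(candidates)

    kept = {}
    for (w1, w2), n in pair_counts.items():
        # Drop pairs seen too rarely to give a reliable estimate.
        if n < min_count:
            continue
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), using relative counts.
        pmi = math.log2(n * total / (left_counts[w1] * right_counts[w2]))
        if pmi >= min_pmi:
            kept[(w1, w2)] = pmi

    distinct = len(pair_counts)
    ratio = (distinct - len(kept)) / distinct if distinct else 0.0
    return kept, ratio


# Toy candidate list (invented data, for illustration only).
toy = (
    [("economy", "development")] * 3
    + [("economy", "problem")]
    + [("society", "problem")] * 3
    + [("society", "development")]
)
kept, ratio = filter_pairs(toy)
```

On this toy input the two pairs that occur only once fall below `min_count` and are discarded, while the two frequent pairs are kept. In a real setting the thresholds would need tuning against manually checked pairs.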