Collocation Extraction with Multiple Hybrid Strategies

WANG Daliang, TU Xuyan, ZHENG Xuefeng, TONG Zijian
DOI: https://doi.org/10.3321/j.issn:1000-0054.2008.04.041
2008-01-01
Abstract: Previous research on lexical collocation extraction has assumed that most statistical evaluation measures are equally effective. However, analysis shows that these measures have different advantages and disadvantages, so they can complement each other. This paper presents a collocation extraction approach with multiple hybrid strategies. Mutual information is used to measure the degree of independence between the two elements of a candidate pair and to discard irrelevant data. The χ2-test was found to describe the co-occurrence and predictability of a collocation well. The log-likelihood ratio is used to address the sparse data problem, which limits other methods. Word-formation rules are then added to build the collocation extraction approach with multiple hybrid strategies. Experimental results show that this method achieves higher accuracy and works well in practice.
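The three statistical measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of pointwise mutual information, Pearson's χ2 and the log-likelihood ratio over a 2×2 contingency table of adjacent word pairs. The function names (`contingency`, `pmi`, `chi_square`, `log_likelihood_ratio`, `score_bigrams`) and the choice of adjacent bigrams as candidates are assumptions made for illustration; the abstract does not specify the paper's exact formulas, thresholds, word-formation rules, or how the scores are combined.

```python
import math
from collections import Counter


def contingency(c12, c1, c2, n):
    """Observed 2x2 counts for a pair (w1, w2) among n bigrams.

    c12: count of the bigram (w1, w2)
    c1:  count of bigrams whose first word is w1
    c2:  count of bigrams whose second word is w2
    """
    o11 = c12                  # w1 followed by w2
    o12 = c1 - c12             # w1 followed by another word
    o21 = c2 - c12             # another word followed by w2
    o22 = n - c1 - c2 + c12    # neither w1 first nor w2 second
    return o11, o12, o21, o22


def pmi(c12, c1, c2, n):
    """Pointwise mutual information in bits; 0 means independence."""
    if c12 == 0:
        return float("-inf")
    return math.log2(c12 * n / (c1 * c2))


def chi_square(c12, c1, c2, n):
    """Pearson's chi-square statistic for the 2x2 table."""
    o11, o12, o21, o22 = contingency(c12, c1, c2, n)
    den = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    if den == 0:
        return 0.0
    return n * (o11 * o22 - o12 * o21) ** 2 / den


def log_likelihood_ratio(c12, c1, c2, n):
    """Log-likelihood ratio (G^2): 2 * sum(O * ln(O / E)) over non-empty cells."""
    observed = contingency(c12, c1, c2, n)
    o11, o12, o21, o22 = observed
    rows = (o11 + o12, o21 + o22)
    cols = (o11 + o21, o12 + o22)
    expected = [rows[i] * cols[j] / n for i in (0, 1) for j in (0, 1)]
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def score_bigrams(tokens):
    """Score every adjacent word pair in a token list with all three measures."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    first = Counter(w1 for w1, _ in bigrams.elements())
    second = Counter(w2 for _, w2 in bigrams.elements())
    n = sum(bigrams.values())
    return {
        pair: {
            "pmi": pmi(c12, first[pair[0]], second[pair[1]], n),
            "chi2": chi_square(c12, first[pair[0]], second[pair[1]], n),
            "llr": log_likelihood_ratio(c12, first[pair[0]], second[pair[1]], n),
        }
        for pair, c12 in bigrams.items()
    }
```

For a pair that co-occurs exactly as often as chance predicts, all three scores are zero; they grow as the pair becomes more strongly associated. Because the log-likelihood ratio skips empty cells rather than dividing by small expected counts, it tends to behave more stably than χ2 on rare pairs, which is consistent with the role the abstract assigns to it.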