Automatic Collocation Extraction Using Web Feedback Data

Dequan Zheng
2010-01-01
Abstract: To improve the precision of collocation extraction, this paper proposes a new method based on Internet data. Traditional corpus-based collocation extraction is constrained by the scale of the corpus, so we acquire collocations from the Web, which contains plentiful information and knowledge. Three classical association measures (co-occurrence frequency, mutual information, and the χ² test) are used to extract collocations automatically. Experimental results show that the Web-based approach outperforms the traditional approach in both precision and recall. This suggests that Internet data may be useful in many NLP applications.
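
The three association measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions (pointwise mutual information and Pearson's χ² on a 2×2 contingency table); the abstract does not give the paper's exact formulas or say how counts are obtained from the Web, so the function name `association_scores` and its count parameters are hypothetical.

```python
import math


def association_scores(n_xy, n_x, n_y, n_total):
    """Score a candidate word pair (x, y) from raw counts.

    Assumed inputs (not specified in the abstract):
      n_xy    -- number of observations containing both x and y
      n_x     -- number of observations containing x
      n_y     -- number of observations containing y
      n_total -- total number of observations (e.g. pages or text windows)
    """
    # Pointwise mutual information: log2( P(x, y) / (P(x) * P(y)) )
    if n_xy > 0:
        pmi = math.log2((n_xy * n_total) / (n_x * n_y))
    else:
        pmi = float("-inf")

    # 2x2 contingency table for Pearson's chi-square statistic
    a = n_xy                          # x and y together
    b = n_x - n_xy                    # x without y
    c = n_y - n_xy                    # y without x
    d = n_total - n_x - n_y + n_xy    # neither x nor y
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n_total * (a * d - b * c) ** 2 / denom if denom else 0.0

    return {
        "frequency": n_xy,
        "mutual_information": pmi,
        "chi_square": chi2,
    }
```

Under these assumptions, a candidate pair would be kept as a collocation when its score under the chosen measure exceeds some threshold; the threshold itself would have to be tuned, since the abstract does not state one.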